Overview

Dataset statistics

Number of variables: 36
Number of observations: 588053
Missing cells: 2085003
Missing cells (%): 9.8%
Duplicate rows: 963
Duplicate rows (%): 0.2%
Total size in memory: 161.5 MiB
Average record size in memory: 288.0 B
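
The derived figures in this table can be recomputed from the raw counts. The sketch below assumes that the missing-cell percentage is taken over variables × observations and that 1 MiB is 2**20 bytes; the variable names are illustrative and not part of the report.

```python
# Raw figures copied from the "Dataset statistics" table.
n_variables = 36
n_observations = 588_053
missing_cells = 2_085_003
duplicate_rows = 963
total_memory_mib = 161.5

# Assumption: percentages are relative to all cells / all rows.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: 1 MiB = 2**20 bytes.
avg_record_bytes = total_memory_mib * 2**20 / n_observations

print(f"Missing cells (%): {missing_pct:.1f}")
print(f"Duplicate rows (%): {duplicate_pct:.1f}")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the recomputed values round to the figures shown in the table.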

Variable types

Categorical: 25
Numeric: 10
Unsupported: 1

Dataset

Description: This profiling report was generated as part of Misiriya's KaggleX BIPOC project.
URL

Alerts

AsOfDate has constant value "20200930" [Constant]
Program has constant value "7A" [Constant]
Dataset has 963 (0.2%) duplicate rows [Duplicates]
BorrName has a high cardinality: 477025 distinct values [High cardinality]
BorrStreet has a high cardinality: 510557 distinct values [High cardinality]
BorrCity has a high cardinality: 26981 distinct values [High cardinality]
BorrState has a high cardinality: 58 distinct values [High cardinality]
BankName has a high cardinality: 3258 distinct values [High cardinality]
BankStreet has a high cardinality: 3469 distinct values [High cardinality]
BankCity has a high cardinality: 2184 distinct values [High cardinality]
BankState has a high cardinality: 55 distinct values [High cardinality]
ApprovalDate has a high cardinality: 3761 distinct values [High cardinality]
FirstDisbursementDate has a high cardinality: 3530 distinct values [High cardinality]
NaicsDescription has a high cardinality: 1220 distinct values [High cardinality]
FranchiseCode has a high cardinality: 5444 distinct values [High cardinality]
FranchiseName has a high cardinality: 4768 distinct values [High cardinality]
ProjectCounty has a high cardinality: 1889 distinct values [High cardinality]
ProjectState has a high cardinality: 58 distinct values [High cardinality]
SBADistrictOffice has a high cardinality: 74 distinct values [High cardinality]
PaidInFullDate has a high cardinality: 130 distinct values [High cardinality]
ChargeOffDate has a high cardinality: 2184 distinct values [High cardinality]
BorrZip is highly overall correlated with BorrState and 3 other fields [High correlation]
GrossApproval is highly overall correlated with SBAGuaranteedApproval and 1 other fields [High correlation]
SBAGuaranteedApproval is highly overall correlated with GrossApproval and 1 other fields [High correlation]
TermInMonths is highly overall correlated with GrossApproval and 1 other fields [High correlation]
CongressionalDistrict is highly overall correlated with SBADistrictOffice [High correlation]
BorrState is highly overall correlated with BorrZip and 3 other fields [High correlation]
BankState is highly overall correlated with BorrZip and 3 other fields [High correlation]
DeliveryMethod is highly overall correlated with subpgmdesc and 1 other fields [High correlation]
subpgmdesc is highly overall correlated with DeliveryMethod and 1 other fields [High correlation]
ProjectState is highly overall correlated with BorrZip and 3 other fields [High correlation]
SBADistrictOffice is highly overall correlated with BorrZip and 4 other fields [High correlation]
RevolverStatus is highly overall correlated with DeliveryMethod and 1 other fields [High correlation]
DeliveryMethod is highly imbalanced (50.3%) [Imbalance]
subpgmdesc is highly imbalanced (59.8%) [Imbalance]
BusinessType is highly imbalanced (61.6%) [Imbalance]
FirstDisbursementDate has 82091 (14.0%) missing values [Missing]
FranchiseCode has 534676 (90.9%) missing values [Missing]
FranchiseName has 534790 (90.9%) missing values [Missing]
PaidInFullDate has 365212 (62.1%) missing values [Missing]
ChargeOffDate has 567384 (96.5%) missing values [Missing]
GrossChargeOffAmount is highly skewed (γ1 = 29.43380104) [Skewed]
BorrName is uniformly distributed [Uniform]
BorrStreet is uniformly distributed [Uniform]
BankZip is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
CongressionalDistrict has 19793 (3.4%) zeros [Zeros]
GrossChargeOffAmount has 567384 (96.5%) zeros [Zeros]
JobsSupported has 93989 (16.0%) zeros [Zeros]
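
The percentages in the Missing alerts follow from the counts and the 588053 observations. The sketch below recomputes them and lists the columns that are missing in more than 90% of rows; the 90% cut-off is a hypothetical threshold chosen for illustration, not a setting taken from the report.

```python
n_observations = 588_053

# Missing-value counts copied from the "Missing" alerts.
missing_counts = {
    "FirstDisbursementDate": 82_091,
    "FranchiseCode": 534_676,
    "FranchiseName": 534_790,
    "PaidInFullDate": 365_212,
    "ChargeOffDate": 567_384,
}

# Percentage of rows with a missing value, rounded as in the report.
missing_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in missing_counts.items()
}

# Hypothetical rule of thumb: flag columns missing in more than 90% of rows.
SPARSE_THRESHOLD_PCT = 90
mostly_missing = [
    name for name, pct in missing_pct.items() if pct > SPARSE_THRESHOLD_PCT
]

for name, pct in missing_pct.items():
    print(f"{name}: {pct}%")
print("Mostly missing:", mostly_missing)
```

Whether such columns should be dropped depends on their meaning. For example, the ChargeOffDate missing count (567384) equals the GrossChargeOffAmount zero count, which suggests that a missing date may simply mean the loan was never charged off.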

Reproduction

Analysis started: 2023-03-20 13:15:38.489317
Analysis finished: 2023-03-20 13:17:21.193447
Duration: 1 minute and 42.7 seconds
Software version: pandas-profiling v0.0.dev0
Download configuration: config.json
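
The reported duration is the difference between the two timestamps. A minimal sketch with the standard library, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2023-03-20 13:15:38.489317")
finished = datetime.fromisoformat("2023-03-20 13:17:21.193447")

# Split the elapsed time into whole minutes and remaining seconds.
elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minute(s) and {seconds:.1f} seconds")
```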

Variables

AsOfDate
Categorical

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB
20200930 588053

Length

Max length: 8
Median length: 8
Mean length: 8
Min length: 8

Characters and Unicode

Total characters: 4704424
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
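
As an illustration of these character properties, the sketch below counts characters and Unicode general categories with Python's standard `unicodedata` module. It is not the report's own implementation, and `char_profile` is a hypothetical helper. The standard library exposes the general category (for example `Nd` for Decimal Number) but has no lookup for scripts or blocks, so those two properties are left out.

```python
import unicodedata
from collections import Counter


def char_profile(values):
    """Count characters and Unicode general categories over all strings."""
    chars = Counter()
    categories = Counter()
    for value in values:
        for ch in value:
            chars[ch] += 1
            # Two-letter general category, e.g. "Nd" (Decimal Number).
            categories[unicodedata.category(ch)] += 1
    return chars, categories


# Two rows of the constant AsOfDate value are enough to see the proportions.
chars, categories = char_profile(["20200930", "20200930"])
total = sum(chars.values())
share = {ch: 100 * count / total for ch, count in chars.items()}

print(chars.most_common())
print(dict(categories))
print(share)
```

Because every row holds the same value, the character shares do not depend on the number of rows: "0" makes up half of all characters, "2" a quarter, and "9" and "3" an eighth each.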

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 20200930
2nd row: 20200930
3rd row: 20200930
4th row: 20200930
5th row: 20200930

Common Values

Value      Count    Frequency (%)
20200930   588053   100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]
Value      Count    Frequency (%)
20200930   588053   100.0%

Most occurring characters

Value   Count     Frequency (%)
0       2352212   50.0%
2       1176106   25.0%
9       588053    12.5%
3       588053    12.5%

Most occurring categories

Value            Count     Frequency (%)
Decimal Number   4704424   100.0%

Most frequent character per category

Decimal Number
Value   Count     Frequency (%)
0       2352212   50.0%
2       1176106   25.0%
9       588053    12.5%
3       588053    12.5%

Most occurring scripts

Value    Count     Frequency (%)
Common   4704424   100.0%

Most frequent character per script

Common
Value   Count     Frequency (%)
0       2352212   50.0%
2       1176106   25.0%
9       588053    12.5%
3       588053    12.5%

Most occurring blocks

Value   Count     Frequency (%)
ASCII   4704424   100.0%

Most frequent character per block

ASCII
Value   Count     Frequency (%)
0       2352212   50.0%
2       1176106   25.0%
9       588053    12.5%
3       588053    12.5%

Program
Categorical

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB
7A 588053

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1176106
Distinct characters: 2
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 7A
2nd row: 7A
3rd row: 7A
4th row: 7A
5th row: 7A

Common Values

Value   Count    Frequency (%)
7A      588053   100.0%

Length

[Figure: histogram of lengths of the category]

Common Values (Plot)

[Figure: plot of common values]
Value   Count    Frequency (%)
7a      588053   100.0%

Most occurring characters

Value   Count    Frequency (%)
7       588053   50.0%
A       588053   50.0%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     588053   50.0%
Uppercase Letter   588053   50.0%

Most frequent character per category

Decimal Number
Value   Count    Frequency (%)
7       588053   100.0%
Uppercase Letter
Value   Count    Frequency (%)
A       588053   100.0%

Most occurring scripts

Value    Count    Frequency (%)
Common   588053   50.0%
Latin    588053   50.0%

Most frequent character per script

Common
Value   Count    Frequency (%)
7       588053   100.0%
Latin
Value   Count    Frequency (%)
A       588053   100.0%

Most occurring blocks

Value   Count     Frequency (%)
ASCII   1176106   100.0%

Most frequent character per block

ASCII
Value   Count    Frequency (%)
7       588053   50.0%
A       588053   50.0%

BorrName
Categorical

HIGH CARDINALITY  UNIFORM 

Distinct: 477025
Distinct (%): 81.1%
Missing: 37
Missing (%): < 0.1%
Memory size: 4.5 MiB
Entity to be formed 45
The Cove at River Oaks, LLC 42
Anytime Fitness 42
Sport Clips 37
Subway 34
Other values (477020) 587816

Length

Max length: 30
Median length: 23
Mean length: 21.91991
Min length: 1

Characters and Unicode

Total characters: 12889258
Distinct characters: 94
Distinct categories: 12
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 394960
Unique (%): 67.2%

Sample

1st row: CRESA PARTNERS - DENVER, INC.
2nd row: The Hilltop Tavern
3rd row: River City Car Wash LLC
4th row: Alphagraphics
5th row: ON SITE AUTOMOTIVE APPEARANCE

Common Values

ValueCountFrequency (%)
Entity to be formed 45
 
< 0.1%
The Cove at River Oaks, LLC 42
 
< 0.1%
Anytime Fitness 42
 
< 0.1%
Sport Clips 37
 
< 0.1%
Subway 34
 
< 0.1%
Briar Rose Estates, LLC 33
 
< 0.1%
HARDRIVES CONSTRUCTION INC 24
 
< 0.1%
Silly Zak's Gluten Free Bakery 21
 
< 0.1%
Vision Logistics Inc. 18
 
< 0.1%
Rock River Laboratory, Inc. 18
 
< 0.1%
Other values (477015) 587702
99.9%
(Missing) 37
 
< 0.1%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
llc 212975
 
10.2%
inc 157989
 
7.6%
43744
 
2.1%
and 23507
 
1.1%
enterprises 15347
 
0.7%
group 14758
 
0.7%
services 14464
 
0.7%
the 13915
 
0.7%
of 11126
 
0.5%
company 9974
 
0.5%
Other values (175762) 1568671
75.2%

Most occurring characters

ValueCountFrequency (%)
1500385
 
11.6%
e 680673
 
5.3%
L 648760
 
5.0%
n 613572
 
4.8%
C 549330
 
4.3%
a 542809
 
4.2%
r 509387
 
4.0%
i 484366
 
3.8%
o 472812
 
3.7%
t 413909
 
3.2%
Other values (84) 6473255
50.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5983217
46.4%
Uppercase Letter 4842952
37.6%
Space Separator 1500385
 
11.6%
Other Punctuation 498744
 
3.9%
Decimal Number 46171
 
0.4%
Dash Punctuation 13186
 
0.1%
Open Punctuation 2381
 
< 0.1%
Close Punctuation 1851
 
< 0.1%
Math Symbol 314
 
< 0.1%
Modifier Symbol 20
 
< 0.1%
Other values (2) 37
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 680673
11.4%
n 613572
10.3%
a 542809
9.1%
r 509387
 
8.5%
i 484366
 
8.1%
o 472812
 
7.9%
t 413909
 
6.9%
s 380909
 
6.4%
c 306531
 
5.1%
l 291045
 
4.9%
Other values (17) 1287204
21.5%
Uppercase Letter
ValueCountFrequency (%)
L 648760
13.4%
C 549330
11.3%
I 400320
 
8.3%
S 326625
 
6.7%
E 321354
 
6.6%
A 310180
 
6.4%
R 266426
 
5.5%
N 266212
 
5.5%
T 253978
 
5.2%
O 205620
 
4.2%
Other values (16) 1294147
26.7%
Other Punctuation
ValueCountFrequency (%)
, 239296
48.0%
. 179243
35.9%
& 50555
 
10.1%
' 22102
 
4.4%
; 3166
 
0.6%
/ 2516
 
0.5%
# 853
 
0.2%
: 389
 
0.1%
? 256
 
0.1%
! 237
 
< 0.1%
Other values (5) 131
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 8756
19.0%
2 8156
17.7%
3 5531
12.0%
0 5315
11.5%
4 4571
9.9%
5 3669
7.9%
6 2649
 
5.7%
8 2595
 
5.6%
9 2474
 
5.4%
7 2455
 
5.3%
Math Symbol
ValueCountFrequency (%)
+ 299
95.2%
> 7
 
2.2%
< 4
 
1.3%
= 3
 
1.0%
| 1
 
0.3%
Open Punctuation
ValueCountFrequency (%)
( 2371
99.6%
[ 9
 
0.4%
{ 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 1843
99.6%
] 8
 
0.4%
Modifier Symbol
ValueCountFrequency (%)
` 18
90.0%
^ 2
 
10.0%
Space Separator
ValueCountFrequency (%)
1500385
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 13186
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 19
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 18
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10826169
84.0%
Common 2063089
 
16.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 680673
 
6.3%
L 648760
 
6.0%
n 613572
 
5.7%
C 549330
 
5.1%
a 542809
 
5.0%
r 509387
 
4.7%
i 484366
 
4.5%
o 472812
 
4.4%
t 413909
 
3.8%
I 400320
 
3.7%
Other values (43) 5510231
50.9%
Common
ValueCountFrequency (%)
1500385
72.7%
, 239296
 
11.6%
. 179243
 
8.7%
& 50555
 
2.5%
' 22102
 
1.1%
- 13186
 
0.6%
1 8756
 
0.4%
2 8156
 
0.4%
3 5531
 
0.3%
0 5315
 
0.3%
Other values (31) 30564
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12889257
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1500385
 
11.6%
e 680673
 
5.3%
L 648760
 
5.0%
n 613572
 
4.8%
C 549330
 
4.3%
a 542809
 
4.2%
r 509387
 
4.0%
i 484366
 
3.8%
o 472812
 
3.7%
t 413909
 
3.2%
Other values (83) 6473254
50.2%
None
ValueCountFrequency (%)
ú 1
100.0%

BorrStreet
Categorical

HIGH CARDINALITY  UNIFORM 

Distinct: 510557
Distinct (%): 86.8%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB
TBD 67
Main Street 45
1111 Parana 24
PO BOX 18
Main St 17
Other values (510552) 587882

Length

Max length: 30
Median length: 24
Mean length: 19.306285
Min length: 1

Characters and Unicode

Total characters: 11353119
Distinct characters: 87
Distinct categories: 12
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 448838
Unique (%): 76.3%

Sample

1st row: 7979 E TUFTS AVE PKWY STE 810
2nd row: 4757 Folsom Blvd
3rd row: 649 Harbor Blvd
4th row: 71 Newtown Road.
5th row: 603 WOODBRIDGE COURT

Common Values

ValueCountFrequency (%)
TBD 67
 
< 0.1%
Main Street 45
 
< 0.1%
1111 Parana 24
 
< 0.1%
PO BOX 18
 
< 0.1%
Main St 17
 
< 0.1%
MAIN ST 15
 
< 0.1%
21107 Omega Circle 15
 
< 0.1%
2781 Kikihau St 15
 
< 0.1%
400 Hana Highway A1 14
 
< 0.1%
7207 CANTERWOOD PL 14
 
< 0.1%
Other values (510547) 587809
> 99.9%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
st 71502
 
3.1%
street 65691
 
2.9%
ave 64589
 
2.8%
rd 56175
 
2.5%
road 55062
 
2.4%
suite 38234
 
1.7%
dr 34927
 
1.5%
w 32453
 
1.4%
n 31993
 
1.4%
s 31946
 
1.4%
Other values (93873) 1796979
78.8%

Most occurring characters

ValueCountFrequency (%)
1696023
 
14.9%
e 593794
 
5.2%
1 490793
 
4.3%
t 432535
 
3.8%
0 383839
 
3.4%
S 381283
 
3.4%
a 349288
 
3.1%
r 336053
 
3.0%
2 322438
 
2.8%
E 289771
 
2.6%
Other values (77) 6077302
53.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3829869
33.7%
Uppercase Letter 3126859
27.5%
Decimal Number 2530562
22.3%
Space Separator 1696023
14.9%
Other Punctuation 155841
 
1.4%
Dash Punctuation 13636
 
0.1%
Open Punctuation 175
 
< 0.1%
Close Punctuation 93
 
< 0.1%
Modifier Symbol 24
 
< 0.1%
Math Symbol 15
 
< 0.1%
Other values (2) 22
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 593794
15.5%
t 432535
11.3%
a 349288
9.1%
r 336053
8.8%
o 276569
 
7.2%
n 260969
 
6.8%
i 254439
 
6.6%
l 195079
 
5.1%
d 190308
 
5.0%
u 141834
 
3.7%
Other values (16) 799001
20.9%
Uppercase Letter
ValueCountFrequency (%)
S 381283
12.2%
E 289771
 
9.3%
R 285348
 
9.1%
A 275181
 
8.8%
T 197979
 
6.3%
N 184559
 
5.9%
D 174870
 
5.6%
L 145024
 
4.6%
W 133325
 
4.3%
C 125606
 
4.0%
Other values (16) 933913
29.9%
Other Punctuation
ValueCountFrequency (%)
. 104621
67.1%
, 26781
 
17.2%
# 19031
 
12.2%
& 3439
 
2.2%
/ 1054
 
0.7%
' 571
 
0.4%
; 181
 
0.1%
? 113
 
0.1%
: 25
 
< 0.1%
@ 14
 
< 0.1%
Other values (3) 11
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 490793
19.4%
0 383839
15.2%
2 322438
12.7%
5 246514
9.7%
3 246295
9.7%
4 215262
8.5%
6 171167
 
6.8%
7 162564
 
6.4%
8 149007
 
5.9%
9 142683
 
5.6%
Math Symbol
ValueCountFrequency (%)
+ 11
73.3%
= 2
 
13.3%
< 1
 
6.7%
> 1
 
6.7%
Modifier Symbol
ValueCountFrequency (%)
` 22
91.7%
^ 2
 
8.3%
Space Separator
ValueCountFrequency (%)
1696023
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 13636
100.0%
Open Punctuation
ValueCountFrequency (%)
( 175
100.0%
Close Punctuation
ValueCountFrequency (%)
) 93
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 14
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6956728
61.3%
Common 4396391
38.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 593794
 
8.5%
t 432535
 
6.2%
S 381283
 
5.5%
a 349288
 
5.0%
r 336053
 
4.8%
E 289771
 
4.2%
R 285348
 
4.1%
o 276569
 
4.0%
A 275181
 
4.0%
n 260969
 
3.8%
Other values (42) 3475937
50.0%
Common
ValueCountFrequency (%)
1696023
38.6%
1 490793
 
11.2%
0 383839
 
8.7%
2 322438
 
7.3%
5 246514
 
5.6%
3 246295
 
5.6%
4 215262
 
4.9%
6 171167
 
3.9%
7 162564
 
3.7%
8 149007
 
3.4%
Other values (25) 312489
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11353119
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1696023
 
14.9%
e 593794
 
5.2%
1 490793
 
4.3%
t 432535
 
3.8%
0 383839
 
3.4%
S 381283
 
3.4%
a 349288
 
3.1%
r 336053
 
3.0%
2 322438
 
2.8%
E 289771
 
2.6%
Other values (77) 6077302
53.5%

BorrCity
Categorical

Distinct: 26981
Distinct (%): 4.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB
LOS ANGELES 4451
HOUSTON 3858
NEW YORK 3356
CHICAGO 2965
BROOKLYN 2404
Other values (26976) 571019

Length

Max length: 30
Median length: 27
Mean length: 8.8787014
Min length: 1

Characters and Unicode

Total characters: 5221147
Distinct characters: 76
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 8160
Unique (%): 1.4%

Sample

1st row: DENVER
2nd row: Sacramento
3rd row: West Sacramento
4th row: Danbury
5th row: MIDDLEBURY

Common Values

ValueCountFrequency (%)
LOS ANGELES 4451
 
0.8%
HOUSTON 3858
 
0.7%
NEW YORK 3356
 
0.6%
CHICAGO 2965
 
0.5%
BROOKLYN 2404
 
0.4%
SAN DIEGO 2213
 
0.4%
DENVER 2200
 
0.4%
DALLAS 2106
 
0.4%
LAS VEGAS 1988
 
0.3%
PORTLAND 1968
 
0.3%
Other values (26971) 560544
95.3%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
city 14248
 
1.9%
san 13124
 
1.7%
new 9388
 
1.2%
beach 6862
 
0.9%
park 6820
 
0.9%
lake 6700
 
0.9%
los 6668
 
0.9%
angeles 6360
 
0.8%
houston 5789
 
0.8%
fort 5499
 
0.7%
Other values (12448) 681210
89.3%

Most occurring characters

ValueCountFrequency (%)
A 331257
 
6.3%
E 302588
 
5.8%
N 272567
 
5.2%
O 266799
 
5.1%
L 258577
 
5.0%
R 222752
 
4.3%
S 218519
 
4.2%
I 197836
 
3.8%
e 185021
 
3.5%
T 181114
 
3.5%
Other values (66) 2784117
53.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3346837
64.1%
Lowercase Letter 1695400
32.5%
Space Separator 174654
 
3.3%
Other Punctuation 2286
 
< 0.1%
Open Punctuation 812
 
< 0.1%
Decimal Number 543
 
< 0.1%
Close Punctuation 500
 
< 0.1%
Dash Punctuation 114
 
< 0.1%
Modifier Symbol 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 331257
 
9.9%
E 302588
 
9.0%
N 272567
 
8.1%
O 266799
 
8.0%
L 258577
 
7.7%
R 222752
 
6.7%
S 218519
 
6.5%
I 197836
 
5.9%
T 181114
 
5.4%
C 137672
 
4.1%
Other values (16) 957156
28.6%
Lowercase Letter
ValueCountFrequency (%)
e 185021
10.9%
a 178448
10.5%
n 157881
9.3%
o 157541
9.3%
l 137810
 
8.1%
i 127018
 
7.5%
r 125874
 
7.4%
t 107256
 
6.3%
s 90458
 
5.3%
d 55336
 
3.3%
Other values (16) 372757
22.0%
Decimal Number
ValueCountFrequency (%)
4 104
19.2%
1 78
14.4%
0 70
12.9%
2 62
11.4%
3 54
9.9%
5 41
 
7.6%
9 38
 
7.0%
6 36
 
6.6%
7 33
 
6.1%
8 27
 
5.0%
Other Punctuation
ValueCountFrequency (%)
. 1601
70.0%
, 413
 
18.1%
' 221
 
9.7%
? 18
 
0.8%
/ 16
 
0.7%
& 8
 
0.3%
; 3
 
0.1%
# 3
 
0.1%
: 3
 
0.1%
Space Separator
ValueCountFrequency (%)
174654
100.0%
Open Punctuation
ValueCountFrequency (%)
( 812
100.0%
Close Punctuation
ValueCountFrequency (%)
) 500
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 114
100.0%
Modifier Symbol
ValueCountFrequency (%)
` 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5042237
96.6%
Common 178910
 
3.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 331257
 
6.6%
E 302588
 
6.0%
N 272567
 
5.4%
O 266799
 
5.3%
L 258577
 
5.1%
R 222752
 
4.4%
S 218519
 
4.3%
I 197836
 
3.9%
e 185021
 
3.7%
T 181114
 
3.6%
Other values (42) 2605207
51.7%
Common
ValueCountFrequency (%)
174654
97.6%
. 1601
 
0.9%
( 812
 
0.5%
) 500
 
0.3%
, 413
 
0.2%
' 221
 
0.1%
- 114
 
0.1%
4 104
 
0.1%
1 78
 
< 0.1%
0 70
 
< 0.1%
Other values (14) 343
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5221147
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 331257
 
6.3%
E 302588
 
5.8%
N 272567
 
5.2%
O 266799
 
5.1%
L 258577
 
5.0%
R 222752
 
4.3%
S 218519
 
4.2%
I 197836
 
3.8%
e 185021
 
3.5%
T 181114
 
3.5%
Other values (66) 2784117
53.3%

BorrState
Categorical

HIGH CARDINALITY  HIGH CORRELATION 

Distinct: 58
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB
CA 73442
TX 44854
NY 37443
OH 33449
FL 28681
Other values (53) 370184

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1176106
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: CO
2nd row: CA
3rd row: CA
4th row: CT
5th row: IN

Common Values

ValueCountFrequency (%)
CA 73442
 
12.5%
TX 44854
 
7.6%
NY 37443
 
6.4%
OH 33449
 
5.7%
FL 28681
 
4.9%
MI 22977
 
3.9%
MA 20755
 
3.5%
IL 19445
 
3.3%
PA 18836
 
3.2%
MN 16981
 
2.9%
Other values (48) 271190
46.1%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
ca 73442
 
12.5%
tx 44854
 
7.6%
ny 37443
 
6.4%
oh 33449
 
5.7%
fl 28681
 
4.9%
mi 22977
 
3.9%
ma 20755
 
3.5%
il 19445
 
3.3%
pa 18836
 
3.2%
mn 16981
 
2.9%
Other values (48) 271190
46.1%

Most occurring characters

ValueCountFrequency (%)
A 183967
15.6%
N 119588
10.2%
C 111834
 
9.5%
M 94419
 
8.0%
I 87937
 
7.5%
T 74089
 
6.3%
O 72655
 
6.2%
L 56887
 
4.8%
X 44854
 
3.8%
Y 43944
 
3.7%
Other values (14) 285932
24.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1176106
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 183967
15.6%
N 119588
10.2%
C 111834
 
9.5%
M 94419
 
8.0%
I 87937
 
7.5%
T 74089
 
6.3%
O 72655
 
6.2%
L 56887
 
4.8%
X 44854
 
3.8%
Y 43944
 
3.7%
Other values (14) 285932
24.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1176106
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 183967
15.6%
N 119588
10.2%
C 111834
 
9.5%
M 94419
 
8.0%
I 87937
 
7.5%
T 74089
 
6.3%
O 72655
 
6.2%
L 56887
 
4.8%
X 44854
 
3.8%
Y 43944
 
3.7%
Other values (14) 285932
24.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1176106
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 183967
15.6%
N 119588
10.2%
C 111834
 
9.5%
M 94419
 
8.0%
I 87937
 
7.5%
T 74089
 
6.3%
O 72655
 
6.2%
L 56887
 
4.8%
X 44854
 
3.8%
Y 43944
 
3.7%
Other values (14) 285932
24.3%

BorrZip
Real number (ℝ)

Distinct: 24334
Distinct (%): 4.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 52318.167
Minimum: 601
Maximum: 99929
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 MiB

Quantile statistics

Minimum: 601
5-th percentile: 3031
Q1: 27954
median: 50613
Q3: 80238
95-th percentile: 96150
Maximum: 99929
Range: 99328
Interquartile range (IQR): 52284
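
Range and IQR follow directly from the other quantile statistics. The sketch below recomputes them and also applies the conventional 1.5 × IQR fences. The fence rule is a common convention added here for illustration and is not something the report itself computes.

```python
# Quantile statistics copied from the BorrZip table.
minimum, maximum = 601, 99929
q1, q3 = 27954, 80238

value_range = maximum - minimum
iqr = q3 - q1

# Conventional Tukey fences (illustrative, not part of the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outside_fences = minimum < lower_fence or maximum > upper_fence

print("Range:", value_range)
print("IQR:", iqr)
print("Any value outside the fences:", outside_fences)
```

Since BorrZip holds postal codes rather than a measured quantity, these statistics mainly describe how the codes are spread over the numeric range.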

Descriptive statistics

Standard deviation: 30775.38
Coefficient of variation (CV): 0.58823506
Kurtosis: -1.2879899
Mean: 52318.167
Median Absolute Deviation (MAD): 28141
Skewness: -0.083580435
Sum: 3.0765855 × 10^10
Variance: 9.47124 × 10^8
Monotonicity: Not monotonic
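
Several of these statistics are tied together by simple identities: CV = standard deviation / mean, variance = standard deviation squared, and sum = mean × number of observations. The sketch below checks them against the rounded values shown in the report, so small differences in the last digits are expected.

```python
# Rounded values copied from the report.
mean = 52318.167
std = 30775.38
n_observations = 588_053

cv = std / mean              # coefficient of variation
variance = std ** 2          # variance from the standard deviation
total = mean * n_observations  # sum of all values

print(f"CV: {cv:.6f}")
print(f"Variance: {variance:.5e}")
print(f"Sum: {total:.7e}")
```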
[Figure: histogram with fixed size bins (bins=50)]
ValueCountFrequency (%)
10001 540
 
0.1%
90021 414
 
0.1%
75034 395
 
0.1%
33166 371
 
0.1%
93401 355
 
0.1%
46032 352
 
0.1%
10018 349
 
0.1%
85260 345
 
0.1%
90015 341
 
0.1%
84115 335
 
0.1%
Other values (24324) 584256
99.4%
ValueCountFrequency (%)
601 10
 
< 0.1%
602 64
< 0.1%
603 93
< 0.1%
604 2
 
< 0.1%
605 14
 
< 0.1%
606 2
 
< 0.1%
610 44
< 0.1%
612 73
< 0.1%
613 2
 
< 0.1%
614 6
 
< 0.1%
ValueCountFrequency (%)
99929 3
 
< 0.1%
99925 3
 
< 0.1%
99921 7
 
< 0.1%
99919 1
 
< 0.1%
99901 54
< 0.1%
99841 1
 
< 0.1%
99840 16
 
< 0.1%
99835 27
< 0.1%
99833 8
 
< 0.1%
99832 5
 
< 0.1%
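
BorrZip is profiled as a real number, so values such as 601 appear without leading zeros. Assuming the column holds 5-digit US ZIP codes stored as integers (an assumption; the report does not state the source format), a sketch of restoring the padded string form could look like this. `format_zip` is a hypothetical helper name.

```python
def format_zip(value: int) -> str:
    """Return the value as a zero-padded 5-character ZIP string."""
    return str(value).zfill(5)


# The smallest and largest values from the BorrZip tables.
for raw in (601, 99929):
    print(raw, "->", format_zip(raw))
```

Treating the column as a string also keeps it out of numeric summaries such as the mean and variance, which carry little meaning for postal codes.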

BankName
Categorical

Distinct: 3258
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB
Wells Fargo Bank, National Association 46635
The Huntington National Bank 38631
JPMorgan Chase Bank, National Association 36302
U.S. Bank, National Association 27943
TD Bank, National Association 16908
Other values (3253) 421634

Length

Max length: 60
Median length: 50
Mean length: 25.526407
Min length: 3

Characters and Unicode

Total characters: 15010880
Distinct characters: 75
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 445
Unique (%): 0.1%

Sample

1st row: Zions Bank, A Division of
2nd row: Plumas Bank
3rd row: Wells Fargo Bank, National Association
4th row: Union Savings Bank
5th row: VelocitySBA, LLC

Common Values

ValueCountFrequency (%)
Wells Fargo Bank, National Association 46635
 
7.9%
The Huntington National Bank 38631
 
6.6%
JPMorgan Chase Bank, National Association 36302
 
6.2%
U.S. Bank, National Association 27943
 
4.8%
TD Bank, National Association 16908
 
2.9%
Manufacturers and Traders Trust Company 13877
 
2.4%
BBVA USA 11992
 
2.0%
Celtic Bank Corporation 8811
 
1.5%
Live Oak Banking Company 8565
 
1.5%
Truist Bank d/b/a Branch Banking & Trust Co 8279
 
1.4%
Other values (3248) 370110
62.9%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
bank 481364
22.2%
national 245889
 
11.3%
association 186924
 
8.6%
the 53049
 
2.4%
fargo 46690
 
2.2%
wells 46644
 
2.1%
of 43280
 
2.0%
trust 41643
 
1.9%
huntington 38631
 
1.8%
chase 36302
 
1.7%
Other values (2239) 949400
43.8%

Most occurring characters

ValueCountFrequency (%)
a 1737693
 
11.6%
1581857
 
10.5%
n 1542405
 
10.3%
i 1084555
 
7.2%
o 1073535
 
7.2%
t 883163
 
5.9%
s 787292
 
5.2%
B 608361
 
4.1%
e 575456
 
3.8%
k 551047
 
3.7%
Other values (65) 4585516
30.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10701953
71.3%
Uppercase Letter 2397225
 
16.0%
Space Separator 1581857
 
10.5%
Other Punctuation 322605
 
2.1%
Decimal Number 4322
 
< 0.1%
Dash Punctuation 2331
 
< 0.1%
Open Punctuation 284
 
< 0.1%
Close Punctuation 284
 
< 0.1%
Math Symbol 19
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1737693
16.2%
n 1542405
14.4%
i 1084555
10.1%
o 1073535
10.0%
t 883163
8.3%
s 787292
7.4%
e 575456
 
5.4%
k 551047
 
5.1%
l 474548
 
4.4%
r 451393
 
4.2%
Other values (16) 1540866
14.4%
Uppercase Letter
ValueCountFrequency (%)
B 608361
25.4%
N 274017
11.4%
A 253137
10.6%
C 221792
 
9.3%
T 147367
 
6.1%
F 136670
 
5.7%
S 135201
 
5.6%
U 80087
 
3.3%
P 77080
 
3.2%
M 76859
 
3.2%
Other values (16) 386654
16.1%
Decimal Number
ValueCountFrequency (%)
1 2975
68.8%
2 549
 
12.7%
3 198
 
4.6%
0 194
 
4.5%
4 164
 
3.8%
5 112
 
2.6%
6 89
 
2.1%
7 23
 
0.5%
8 18
 
0.4%
Other Punctuation
ValueCountFrequency (%)
, 207675
64.4%
. 73937
 
22.9%
/ 19259
 
6.0%
& 18580
 
5.8%
' 3150
 
1.0%
\ 2
 
< 0.1%
" 2
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
+ 9
47.4%
< 5
26.3%
> 5
26.3%
Space Separator
ValueCountFrequency (%)
1581857
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2331
100.0%
Open Punctuation
ValueCountFrequency (%)
( 284
100.0%
Close Punctuation
ValueCountFrequency (%)
) 284
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 13099178
87.3%
Common 1911702
 
12.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1737693
13.3%
n 1542405
11.8%
i 1084555
 
8.3%
o 1073535
 
8.2%
t 883163
 
6.7%
s 787292
 
6.0%
B 608361
 
4.6%
e 575456
 
4.4%
k 551047
 
4.2%
l 474548
 
3.6%
Other values (42) 3781123
28.9%
Common
ValueCountFrequency (%)
1581857
82.7%
, 207675
 
10.9%
. 73937
 
3.9%
/ 19259
 
1.0%
& 18580
 
1.0%
' 3150
 
0.2%
1 2975
 
0.2%
- 2331
 
0.1%
2 549
 
< 0.1%
( 284
 
< 0.1%
Other values (13) 1105
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15010880
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1737693
 
11.6%
1581857
 
10.5%
n 1542405
 
10.3%
i 1084555
 
7.2%
o 1073535
 
7.2%
t 883163
 
5.9%
s 787292
 
5.2%
B 608361
 
4.1%
e 575456
 
3.8%
k 551047
 
3.7%
Other values (65) 4585516
30.5%

BankStreet
Categorical

Distinct: 3469
Distinct (%): 0.6%
Missing: 4
Missing (%): < 0.1%
Memory size: 4.5 MiB
101 N Philips Ave 46635
17 S High St 38631
1111 Polaris Pkwy 36302
425 Walnut St 27943
2035 Limestone Rd 16908
Other values (3464) 421630

Length

Max length: 30
Median length: 27
Mean length: 16.855216
Min length: 8

Characters and Unicode

Total characters: 9911693
Distinct characters: 74
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 477
Unique (%): 0.1%

Sample

1st row: 1 S Main St
2nd row: 336 W Main St
3rd row: 101 N Philips Ave
4th row: 226 Main St
5th row: 9385 Haven Avenue

Common Values

ValueCountFrequency (%)
101 N Philips Ave 46635
 
7.9%
17 S High St 38631
 
6.6%
1111 Polaris Pkwy 36302
 
6.2%
425 Walnut St 27943
 
4.8%
2035 Limestone Rd 16908
 
2.9%
One M & T Plaza, 15th Fl 13877
 
2.4%
15 S 20th St 11992
 
2.0%
268 S State St, Ste 300 8811
 
1.5%
1741 Tiburon Dr 8565
 
1.5%
214 N Tryon St 8279
 
1.4%
Other values (3459) 370106
62.9%

Length

[Figure: histogram of lengths of the category]
ValueCountFrequency (%)
st 254015
 
11.2%
ave 101252
 
4.4%
n 100419
 
4.4%
s 100021
 
4.4%
ste 63642
 
2.8%
101 53433
 
2.3%
main 48136
 
2.1%
philips 46635
 
2.0%
pkwy 39720
 
1.7%
high 39129
 
1.7%
Other values (3179) 1431577
62.8%

Most occurring characters

ValueCountFrequency (%)
1689937
17.0%
1 632448
 
6.4%
t 578993
 
5.8%
S 482726
 
4.9%
e 474284
 
4.8%
i 423755
 
4.3%
0 423374
 
4.3%
a 371394
 
3.7%
n 330798
 
3.3%
l 310763
 
3.1%
Other values (64) 4193221
42.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4378249
44.2%
Decimal Number 2109100
21.3%
Space Separator 1689937
 
17.0%
Uppercase Letter 1598294
 
16.1%
Other Punctuation 121141
 
1.2%
Math Symbol 9305
 
0.1%
Dash Punctuation 5665
 
0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 578993
13.2%
e 474284
10.8%
i 423755
9.7%
a 371394
 
8.5%
n 330798
 
7.6%
l 310763
 
7.1%
r 279740
 
6.4%
s 219466
 
5.0%
o 219408
 
5.0%
h 206675
 
4.7%
Other values (16) 962973
22.0%
Uppercase Letter
ValueCountFrequency (%)
S 482726
30.2%
P 175100
 
11.0%
A 130875
 
8.2%
N 118165
 
7.4%
M 96061
 
6.0%
W 88439
 
5.5%
H 65913
 
4.1%
R 58065
 
3.6%
B 53426
 
3.3%
F 50951
 
3.2%
Other values (16) 278573
17.4%
Decimal Number
ValueCountFrequency (%)
1 632448
30.0%
0 423374
20.1%
2 234938
 
11.1%
5 192540
 
9.1%
7 136417
 
6.5%
4 135467
 
6.4%
3 130178
 
6.2%
9 80696
 
3.8%
8 74890
 
3.6%
6 68152
 
3.2%
Other Punctuation
ValueCountFrequency (%)
, 92959
76.7%
& 15015
 
12.4%
. 11459
 
9.5%
' 953
 
0.8%
# 547
 
0.5%
/ 208
 
0.2%
Math Symbol
ValueCountFrequency (%)
< 4653
50.0%
> 4652
50.0%
Space Separator
ValueCountFrequency (%)
1689937
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5665
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5976543
60.3%
Common 3935150
39.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 578993
 
9.7%
S 482726
 
8.1%
e 474284
 
7.9%
i 423755
 
7.1%
a 371394
 
6.2%
n 330798
 
5.5%
l 310763
 
5.2%
r 279740
 
4.7%
s 219466
 
3.7%
o 219408
 
3.7%
Other values (42) 2285216
38.2%
Common
ValueCountFrequency (%)
1689937
42.9%
1 632448
 
16.1%
0 423374
 
10.8%
2 234938
 
6.0%
5 192540
 
4.9%
7 136417
 
3.5%
4 135467
 
3.4%
3 130178
 
3.3%
, 92959
 
2.4%
9 80696
 
2.1%
Other values (12) 186196
 
4.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9911693
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1689937
17.0%
1 632448
 
6.4%
t 578993
 
5.8%
S 482726
 
4.9%
e 474284
 
4.8%
i 423755
 
4.3%
0 423374
 
4.3%
a 371394
 
3.7%
n 330798
 
3.3%
l 310763
 
3.1%
Other values (64) 4193221
42.3%

BankCity
Categorical

Distinct2184
Distinct (%)0.4%
Missing4
Missing (%)< 0.1%
Memory size4.5 MiB
COLUMBUS 76399
SIOUX FALLS 50551
WILMINGTON 34735
CINCINNATI 32285
SALT LAKE CITY 16659
Other values (2179) 377420

Length

Max length24
Median length21
Mean length9.4283691
Min length3

Characters and Unicode

Total characters5544343
Distinct characters53
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
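
As a rough illustration of the character properties mentioned above, the sketch below tallies the Unicode general category of each character in a few values, using Python's standard `unicodedata` module. The three strings come from the BankCity sample rows; the helper name `category_counts` is hypothetical, and the report generator's own implementation may differ.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories (e.g. 'Lu', 'Ll', 'Zs') over all characters."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts

sample = ["SALT LAKE CITY", "QUINCY", "Rancho Cucamonga"]
counts = category_counts(sample)
# 'Lu' = Uppercase Letter, 'Ll' = Lowercase Letter, 'Zs' = Space Separator
print(counts)
```

The two-letter codes map onto the category names used in the tables of this report (Uppercase Letter, Lowercase Letter, Space Separator, and so on).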

Unique

Unique250
Unique (%)< 0.1%

Sample

1st rowSALT LAKE CITY
2nd rowQUINCY
3rd rowSIOUX FALLS
4th rowDANBURY
5th rowRancho Cucamonga

Common Values

ValueCountFrequency (%)
COLUMBUS 76399
 
13.0%
SIOUX FALLS 50551
 
8.6%
WILMINGTON 34735
 
5.9%
CINCINNATI 32285
 
5.5%
SALT LAKE CITY 16659
 
2.8%
LOS ANGELES 16486
 
2.8%
BIRMINGHAM 14120
 
2.4%
BUFFALO 13967
 
2.4%
CHARLOTTE 10841
 
1.8%
CLEVELAND 7874
 
1.3%
Other values (2174) 314132
53.4%
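
The sample rows for BankCity include "Rancho Cucamonga" alongside fully upper-case names, and the category table reports 217670 lowercase letters, so the same city may be counted under more than one spelling. A minimal sketch of counting after normalization follows; the input values are hypothetical and the helper name `normalized_counts` is an assumption.

```python
from collections import Counter

def normalized_counts(values):
    """Count values after trimming whitespace and upper-casing them."""
    return Counter(v.strip().upper() for v in values if v is not None)

# Hypothetical values for illustration; not rows from the dataset.
cities = ["COLUMBUS", "Columbus", "columbus ", "SIOUX FALLS", None]
city_counts = normalized_counts(cities)
print(city_counts.most_common())
```

Whether upper-casing is the right canonical form is a design choice; it is used here only because most values in the column are already upper-case.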

Length

Histogram of lengths of the category
ValueCountFrequency (%)
columbus 76504
 
9.5%
falls 51276
 
6.4%
sioux 50606
 
6.3%
wilmington 34801
 
4.3%
cincinnati 32285
 
4.0%
city 28343
 
3.5%
lake 22569
 
2.8%
salt 16709
 
2.1%
los 16693
 
2.1%
angeles 16693
 
2.1%
Other values (2008) 458012
56.9%

Most occurring characters

ValueCountFrequency (%)
L 517427
 
9.3%
A 440024
 
7.9%
I 437492
 
7.9%
N 407047
 
7.3%
O 402274
 
7.3%
S 399668
 
7.2%
E 322974
 
5.8%
C 292276
 
5.3%
U 285975
 
5.2%
T 261815
 
4.7%
Other values (43) 1777371
32.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 5110099
92.2%
Lowercase Letter 217670
 
3.9%
Space Separator 216442
 
3.9%
Other Punctuation 132
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
L 517427
 
10.1%
A 440024
 
8.6%
I 437492
 
8.6%
N 407047
 
8.0%
O 402274
 
7.9%
S 399668
 
7.8%
E 322974
 
6.3%
C 292276
 
5.7%
U 285975
 
5.6%
T 261815
 
5.1%
Other values (16) 1343127
26.3%
Lowercase Letter
ValueCountFrequency (%)
a 31836
14.6%
n 25219
11.6%
o 19412
8.9%
e 17907
 
8.2%
r 14489
 
6.7%
u 13607
 
6.3%
g 13011
 
6.0%
c 12906
 
5.9%
t 12635
 
5.8%
i 10345
 
4.8%
Other values (14) 46303
21.3%
Other Punctuation
ValueCountFrequency (%)
' 81
61.4%
. 51
38.6%
Space Separator
ValueCountFrequency (%)
(space) 216442
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5327769
96.1%
Common 216574
 
3.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
L 517427
 
9.7%
A 440024
 
8.3%
I 437492
 
8.2%
N 407047
 
7.6%
O 402274
 
7.6%
S 399668
 
7.5%
E 322974
 
6.1%
C 292276
 
5.5%
U 285975
 
5.4%
T 261815
 
4.9%
Other values (40) 1560797
29.3%
Common
ValueCountFrequency (%)
(space) 216442
99.9%
' 81
 
< 0.1%
. 51
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5544343
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
L 517427
 
9.3%
A 440024
 
7.9%
I 437492
 
7.9%
N 407047
 
7.3%
O 402274
 
7.3%
S 399668
 
7.2%
E 322974
 
5.8%
C 292276
 
5.3%
U 285975
 
5.2%
T 261815
 
4.7%
Other values (43) 1777371
32.1%

BankState
Categorical

HIGH CARDINALITY  HIGH CORRELATION 

Distinct55
Distinct (%)< 0.1%
Missing4
Missing (%)< 0.1%
Memory size4.5 MiB
OH 121906
CA 52910
SD 51478
NY 29944
DE 26274
Other values (50) 305537

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters1176098
Distinct characters25
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2
Unique (%)< 0.1%

Sample

1st rowUT
2nd rowCA
3rd rowSD
4th rowCT
5th rowCA

Common Values

ValueCountFrequency (%)
OH 121906
20.7%
CA 52910
 
9.0%
SD 51478
 
8.8%
NY 29944
 
5.1%
DE 26274
 
4.5%
NC 23601
 
4.0%
UT 21333
 
3.6%
MA 20427
 
3.5%
TX 18702
 
3.2%
MN 14961
 
2.5%
Other values (45) 206513
35.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
oh 121906
20.7%
ca 52910
 
9.0%
sd 51478
 
8.8%
ny 29944
 
5.1%
de 26274
 
4.5%
nc 23601
 
4.0%
ut 21333
 
3.6%
ma 20427
 
3.5%
tx 18702
 
3.2%
mn 14961
 
2.5%
Other values (45) 206513
35.1%

Most occurring characters

ValueCountFrequency (%)
A 142276
12.1%
O 140820
12.0%
H 127063
10.8%
N 96635
 
8.2%
C 84415
 
7.2%
D 82931
 
7.1%
M 63925
 
5.4%
S 60839
 
5.2%
I 56898
 
4.8%
T 54877
 
4.7%
Other values (15) 265419
22.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1176097
> 99.9%
Lowercase Letter 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 142276
12.1%
O 140820
12.0%
H 127063
10.8%
N 96635
 
8.2%
C 84415
 
7.2%
D 82931
 
7.1%
M 63925
 
5.4%
S 60839
 
5.2%
I 56898
 
4.8%
T 54877
 
4.7%
Other values (14) 265418
22.6%
Lowercase Letter
ValueCountFrequency (%)
n 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1176098
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 142276
12.1%
O 140820
12.0%
H 127063
10.8%
N 96635
 
8.2%
C 84415
 
7.2%
D 82931
 
7.1%
M 63925
 
5.4%
S 60839
 
5.2%
I 56898
 
4.8%
T 54877
 
4.7%
Other values (15) 265419
22.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1176098
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 142276
12.1%
O 140820
12.0%
H 127063
10.8%
N 96635
 
8.2%
C 84415
 
7.2%
D 82931
 
7.1%
M 63925
 
5.4%
S 60839
 
5.2%
I 56898
 
4.8%
T 54877
 
4.7%
Other values (15) 265419
22.6%

BankZip
Unsupported

REJECTED  UNSUPPORTED 

Missing4
Missing (%)< 0.1%
Memory size4.5 MiB
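
The report does not state why BankZip was rejected as unsupported. One possible cause, assumed here purely for illustration, is a column that mixes numbers and strings. The sketch below coerces such values into five-character ZIP strings; the helper name `normalize_zip` and the input values are hypothetical.

```python
def normalize_zip(value):
    """Return a 5-character ZIP string, or None when the value is missing or not numeric."""
    if value is None:
        return None
    text = str(value).strip()
    # A float such as 6810.0 loses its leading zero and gains ".0"; undo both.
    if text.endswith(".0"):
        text = text[:-2]
    if not text.isdigit() or len(text) > 5:
        return None
    return text.zfill(5)

# Hypothetical raw values; not rows from the dataset.
raw = [6810, "06810", 43215.0, "N/A", None]
cleaned = [normalize_zip(v) for v in raw]
print(cleaned)
```

Keeping ZIP codes as strings preserves leading zeros, which a numeric type would drop.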

GrossApproval
Real number (ℝ)

Distinct22149
Distinct (%)3.8%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean393387.47
Minimum1000
Maximum5000000
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size4.5 MiB

Quantile statistics

Minimum1000
5-th percentile10000
Q145000
median137500
Q3381000
95-th percentile1762500
Maximum5000000
Range4999000
Interquartile range (IQR)336000

Descriptive statistics

Standard deviation695480.22
Coefficient of variation (CV)1.7679267
Kurtosis14.87773
Mean393387.47
Median Absolute Deviation (MAD)112500
Skewness3.5098264
Sum2.3133268 × 10^11
Variance4.8369274 × 10^11
MonotonicityNot monotonic
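
The derived statistics for GrossApproval can be recomputed from the reported values as a quick consistency check. The sketch below uses only numbers printed in this section; the variable names are illustrative.

```python
mean = 393387.47
std = 695480.22
q1, q3 = 45000, 381000
minimum, maximum = 1000, 5000000

cv = std / mean                    # coefficient of variation
iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum    # range
variance = std ** 2                # variance is the squared standard deviation

print(round(cv, 4), iqr, value_range, f"{variance:.4e}")
```

A coefficient of variation well above 1, together with a mean far above the median, is consistent with the strong right skew reported for this variable.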
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
50000 38301
 
6.5%
25000 34004
 
5.8%
100000 29006
 
4.9%
150000 28342
 
4.8%
10000 20256
 
3.4%
350000 15196
 
2.6%
20000 14203
 
2.4%
15000 12876
 
2.2%
250000 11020
 
1.9%
30000 10421
 
1.8%
Other values (22139) 374428
63.7%
ValueCountFrequency (%)
1000 19
< 0.1%
1500 4
 
< 0.1%
1600 1
 
< 0.1%
1700 1
 
< 0.1%
1900 1
 
< 0.1%
2000 12
< 0.1%
2300 2
 
< 0.1%
2400 2
 
< 0.1%
2500 17
< 0.1%
2600 2
 
< 0.1%
ValueCountFrequency (%)
5000000 1898
0.3%
4999900 3
 
< 0.1%
4999700 1
 
< 0.1%
4999500 1
 
< 0.1%
4999000 1
 
< 0.1%
4998600 1
 
< 0.1%
4998200 1
 
< 0.1%
4998100 1
 
< 0.1%
4998000 4
 
< 0.1%
4997000 1
 
< 0.1%

SBAGuaranteedApproval
Real number (ℝ)

Distinct29376
Distinct (%)5.0%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean291622.72
Minimum500
Maximum6175000
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size4.5 MiB

Quantile statistics

Minimum500
5-th percentile5000
Q125000
median87500
Q3288750
95-th percentile1350000
Maximum6175000
Range6174500
Interquartile range (IQR)263750

Descriptive statistics

Standard deviation534271.76
Coefficient of variation (CV)1.832065
Kurtosis15.186445
Mean291622.72
Median Absolute Deviation (MAD)75000
Skewness3.5212083
Sum1.7148961 × 10^11
Variance2.8544631 × 10^11
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
25000 33927
 
5.8%
12500 30767
 
5.2%
50000 22195
 
3.8%
5000 18816
 
3.2%
127500 17082
 
2.9%
10000 13100
 
2.2%
7500 11719
 
2.0%
75000 10628
 
1.8%
15000 9580
 
1.6%
2500 8383
 
1.4%
Other values (29366) 411856
70.0%
ValueCountFrequency (%)
500 19
< 0.1%
750 4
 
< 0.1%
800 1
 
< 0.1%
850 1
 
< 0.1%
950 1
 
< 0.1%
1000 12
< 0.1%
1150 2
 
< 0.1%
1200 2
 
< 0.1%
1250 16
< 0.1%
1300 2
 
< 0.1%
ValueCountFrequency (%)
6175000 1
 
< 0.1%
5400000 1
 
< 0.1%
5250000 1
 
< 0.1%
5124526 1
 
< 0.1%
4556600 1
 
< 0.1%
4500000 276
< 0.1%
4498740 1
 
< 0.1%
4493718 1
 
< 0.1%
4492530 1
 
< 0.1%
4490370 1
 
< 0.1%
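
The smallest SBAGuaranteedApproval values (500, 750, 800, ...) are half the smallest GrossApproval values (1000, 1500, 1600, ...), while the maximum guaranteed amount (6175000) exceeds the maximum gross amount (5000000). A sketch of computing the guaranteed share and flagging rows where the guarantee exceeds the gross amount follows; the row pairings are hypothetical, since the report does not show individual records.

```python
# Hypothetical (GrossApproval, SBAGuaranteedApproval) pairs for illustration only.
loans = [
    (1000, 500),
    (50000, 25000),
    (5000000, 4500000),
    (5000000, 6175000),  # guaranteed amount above the gross amount
]

def guaranteed_share(gross, guaranteed):
    """Fraction of the gross approval that is guaranteed."""
    return guaranteed / gross

shares = [guaranteed_share(g, s) for g, s in loans]
suspect = [(g, s) for g, s in loans if s > g]
print(shares)
print(suspect)
```

Any row with a share above 1 is worth inspecting before the two amounts are used together in a model.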

ApprovalDate
Categorical

Distinct3761
Distinct (%)0.6%
Missing0
Missing (%)0.0%
Memory size4.5 MiB
11-03-2010 1500
28-07-2015 1358
01-10-2010 1082
24-12-2009 1001
28-01-2019 976
Other values (3756) 582136

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters5880530
Distinct characters11
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique197
Unique (%)< 0.1%

Sample

1st row01-10-2009
2nd row01-10-2009
3rd row01-10-2009
4th row01-10-2009
5th row01-10-2009
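
ApprovalDate is profiled as a categorical string. Values such as 28-07-2015 and 24-12-2009 show that the format is day first (DD-MM-YYYY). A minimal sketch of parsing such strings with the standard library follows; the helper name `parse_day_first` is hypothetical.

```python
from datetime import datetime

def parse_day_first(text):
    """Parse a 'DD-MM-YYYY' string into a date; return None for missing values."""
    if not text:
        return None
    return datetime.strptime(text, "%d-%m-%Y").date()

approval = parse_day_first("01-10-2009")
print(approval.isoformat())  # ISO form is unambiguous: year-month-day
```

Passing the format explicitly avoids the month/day ambiguity of values like 01-10-2009, which a guessing parser could read as 10 January.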

Common Values

ValueCountFrequency (%)
11-03-2010 1500
 
0.3%
28-07-2015 1358
 
0.2%
01-10-2010 1082
 
0.2%
24-12-2009 1001
 
0.2%
28-01-2019 976
 
0.2%
22-07-2015 955
 
0.2%
30-09-2013 912
 
0.2%
20-11-2009 902
 
0.2%
21-07-2015 880
 
0.1%
23-07-2015 815
 
0.1%
Other values (3751) 577672
98.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
11-03-2010 1500
 
0.3%
28-07-2015 1358
 
0.2%
01-10-2010 1082
 
0.2%
24-12-2009 1001
 
0.2%
28-01-2019 976
 
0.2%
22-07-2015 955
 
0.2%
30-09-2013 912
 
0.2%
20-11-2009 902
 
0.2%
21-07-2015 880
 
0.1%
23-07-2015 815
 
0.1%
Other values (3751) 577672
98.2%

Most occurring characters

ValueCountFrequency (%)
0 1386831
23.6%
- 1176106
20.0%
1 1081879
18.4%
2 1027143
17.5%
3 185939
 
3.2%
9 178953
 
3.0%
7 172461
 
2.9%
6 170441
 
2.9%
5 169680
 
2.9%
8 167492
 
2.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4704424
80.0%
Dash Punctuation 1176106
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1386831
29.5%
1 1081879
23.0%
2 1027143
21.8%
3 185939
 
4.0%
9 178953
 
3.8%
7 172461
 
3.7%
6 170441
 
3.6%
5 169680
 
3.6%
8 167492
 
3.6%
4 163605
 
3.5%
Dash Punctuation
ValueCountFrequency (%)
- 1176106
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5880530
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1386831
23.6%
- 1176106
20.0%
1 1081879
18.4%
2 1027143
17.5%
3 185939
 
3.2%
9 178953
 
3.0%
7 172461
 
2.9%
6 170441
 
2.9%
5 169680
 
2.9%
8 167492
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5880530
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1386831
23.6%
- 1176106
20.0%
1 1081879
18.4%
2 1027143
17.5%
3 185939
 
3.2%
9 178953
 
3.0%
7 172461
 
2.9%
6 170441
 
2.9%
5 169680
 
2.9%
8 167492
 
2.8%

ApprovalFiscalYear
Real number (ℝ)

Distinct11
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean2015.1043
Minimum2010
Maximum2020
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size4.5 MiB

Quantile statistics

Minimum2010
5-th percentile2010
Q12013
median2015
Q32018
95-th percentile2020
Maximum2020
Range10
Interquartile range (IQR)5

Descriptive statistics

Standard deviation3.0331319
Coefficient of variation (CV)0.0015051985
Kurtosis-1.1194012
Mean2015.1043
Median Absolute Deviation (MAD)3
Skewness-0.11117726
Sum1.1849881 × 10^9
Variance9.1998893
MonotonicityIncreasing
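
The minimum ApprovalFiscalYear is 2010 even though the ApprovalDate sample rows are dated 01-10-2009. That is consistent with the US federal fiscal year, which begins on 1 October. Assuming the column follows that convention (the report does not say so explicitly), the fiscal year can be derived from a date as sketched below; the function name is hypothetical.

```python
from datetime import date

def federal_fiscal_year(d):
    """US federal fiscal year: October through December count toward the next year."""
    return d.year + 1 if d.month >= 10 else d.year

print(federal_fiscal_year(date(2009, 10, 1)))
print(federal_fiscal_year(date(2020, 9, 30)))
```

Under this rule the dataset's AsOfDate of 20200930 falls on the last day of fiscal year 2020, which matches the reported maximum.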
Histogram with fixed size bins (bins=11)
ValueCountFrequency (%)
2016 64074
10.9%
2015 63461
10.8%
2017 62430
10.6%
2018 60354
10.3%
2011 53710
9.1%
2014 52044
8.9%
2019 51907
8.8%
2010 47000
8.0%
2013 46395
7.9%
2012 44376
7.5%
ValueCountFrequency (%)
2010 47000
8.0%
2011 53710
9.1%
2012 44376
7.5%
2013 46395
7.9%
2014 52044
8.9%
2015 63461
10.8%
2016 64074
10.9%
2017 62430
10.6%
2018 60354
10.3%
2019 51907
8.8%
ValueCountFrequency (%)
2020 42302
7.2%
2019 51907
8.8%
2018 60354
10.3%
2017 62430
10.6%
2016 64074
10.9%
2015 63461
10.8%
2014 52044
8.9%
2013 46395
7.9%
2012 44376
7.5%
2011 53710
9.1%

FirstDisbursementDate
Categorical

HIGH CARDINALITY  MISSING 

Distinct3530
Distinct (%)0.7%
Missing82091
Missing (%)14.0%
Memory size4.5 MiB
30-06-2017 5104
31-10-2017 4985
30-09-2017 4928
31-08-2017 4914
31-05-2017 4882
Other values (3525) 481149

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters5059620
Distinct characters11
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique325
Unique (%)0.1%

Sample

1st row01-10-2009
2nd row01-10-2009
3rd row01-11-2009
4th row01-10-2009
5th row01-10-2009

Common Values

ValueCountFrequency (%)
30-06-2017 5104
 
0.9%
31-10-2017 4985
 
0.8%
30-09-2017 4928
 
0.8%
31-08-2017 4914
 
0.8%
31-05-2017 4882
 
0.8%
31-03-2018 4715
 
0.8%
31-05-2018 4622
 
0.8%
31-12-2017 4547
 
0.8%
30-06-2018 4529
 
0.8%
31-08-2018 4504
 
0.8%
Other values (3520) 458232
77.9%
(Missing) 82091
 
14.0%
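
The percentages in this table use all 588053 rows as the denominator, while the frequency table that accompanies the length histogram for this variable reports the same counts at slightly higher percentages (for example 1.0% instead of 0.9%). The arithmetic below suggests that the second table divides by the non-missing rows only; this is an inference from the numbers, not something the report states.

```python
total_rows = 588053
missing = 82091
count = 5104  # rows with FirstDisbursementDate == "30-06-2017"

pct_missing = 100 * missing / total_rows
pct_of_all_rows = 100 * count / total_rows
pct_of_non_missing = 100 * count / (total_rows - missing)

print(round(pct_missing, 1), round(pct_of_all_rows, 1), round(pct_of_non_missing, 1))
```

When comparing frequencies across variables with different missing rates, it helps to fix one denominator and use it throughout.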

Length

Histogram of lengths of the category
ValueCountFrequency (%)
30-06-2017 5104
 
1.0%
31-10-2017 4985
 
1.0%
30-09-2017 4928
 
1.0%
31-08-2017 4914
 
1.0%
31-05-2017 4882
 
1.0%
31-03-2018 4715
 
0.9%
31-05-2018 4622
 
0.9%
31-12-2017 4547
 
0.9%
30-06-2018 4529
 
0.9%
31-08-2018 4504
 
0.9%
Other values (3520) 458232
90.6%

Most occurring characters

ValueCountFrequency (%)
0 1323670
26.2%
1 1088211
21.5%
- 1011924
20.0%
2 709020
14.0%
3 250031
 
4.9%
8 118551
 
2.3%
6 116900
 
2.3%
7 113866
 
2.3%
5 112864
 
2.2%
9 108870
 
2.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4047696
80.0%
Dash Punctuation 1011924
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1323670
32.7%
1 1088211
26.9%
2 709020
17.5%
3 250031
 
6.2%
8 118551
 
2.9%
6 116900
 
2.9%
7 113866
 
2.8%
5 112864
 
2.8%
9 108870
 
2.7%
4 105713
 
2.6%
Dash Punctuation
ValueCountFrequency (%)
- 1011924
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5059620
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1323670
26.2%
1 1088211
21.5%
- 1011924
20.0%
2 709020
14.0%
3 250031
 
4.9%
8 118551
 
2.3%
6 116900
 
2.3%
7 113866
 
2.3%
5 112864
 
2.2%
9 108870
 
2.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5059620
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1323670
26.2%
1 1088211
21.5%
- 1011924
20.0%
2 709020
14.0%
3 250031
 
4.9%
8 118551
 
2.3%
6 116900
 
2.3%
7 113866
 
2.3%
5 112864
 
2.2%
9 108870
 
2.2%

DeliveryMethod
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct15
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.5 MiB
SBA EXPRES 280906
PLP 195338
OTH 7A 49625
SLA 31522
CA 6390
Other values (10) 24272

Length

Max length10
Median length10
Mean length6.780814
Min length2

Characters and Unicode

Total characters3987478
Distinct characters21
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowSBA EXPRES
2nd rowPLP
3rd rowPLP
4th rowSBA EXPRES
5th rowCOMM EXPRS

Common Values

ValueCountFrequency (%)
SBA EXPRES 280906
47.8%
PLP 195338
33.2%
OTH 7A 49625
 
8.4%
SLA 31522
 
5.4%
CA 6390
 
1.1%
PATRIOT EX 6247
 
1.1%
COMM EXPRS 5468
 
0.9%
RLA 3096
 
0.5%
CLP 2653
 
0.5%
EWCP 1798
 
0.3%
Other values (5) 5010
 
0.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
expres 282362
30.2%
sba 280906
30.0%
plp 195338
20.9%
oth 49625
 
5.3%
7a 49625
 
5.3%
sla 31522
 
3.4%
ca 6390
 
0.7%
patriot 6247
 
0.7%
ex 6247
 
0.7%
exprs 5468
 
0.6%
Other values (11) 21443
 
2.3%

Most occurring characters

ValueCountFrequency (%)
P 690785
17.3%
S 601978
15.1%
E 583100
14.6%
A 379506
9.5%
(space) 347120
8.7%
R 300569
7.5%
X 295544
7.4%
B 280906
7.0%
L 234329
 
5.9%
T 65515
 
1.6%
Other values (11) 208126
 
5.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3590733
90.1%
Space Separator 347120
 
8.7%
Decimal Number 49625
 
1.2%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
P 690785
19.2%
S 601978
16.8%
E 583100
16.2%
A 379506
10.6%
R 300569
8.4%
X 295544
8.2%
B 280906
7.8%
L 234329
 
6.5%
T 65515
 
1.8%
O 64780
 
1.8%
Other values (9) 93721
 
2.6%
Space Separator
ValueCountFrequency (%)
(space) 347120
100.0%
Decimal Number
ValueCountFrequency (%)
7 49625
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3590733
90.1%
Common 396745
 
9.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
P 690785
19.2%
S 601978
16.8%
E 583100
16.2%
A 379506
10.6%
R 300569
8.4%
X 295544
8.2%
B 280906
7.8%
L 234329
 
6.5%
T 65515
 
1.8%
O 64780
 
1.8%
Other values (9) 93721
 
2.6%
Common
ValueCountFrequency (%)
(space) 347120
87.5%
7 49625
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3987478
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
P 690785
17.3%
S 601978
15.1%
E 583100
14.6%
A 379506
9.5%
(space) 347120
8.7%
R 300569
7.5%
X 295544
7.4%
B 280906
7.0%
L 234329
 
5.9%
T 65515
 
1.6%
Other values (11) 208126
 
5.2%

subpgmdesc
Categorical

HIGH CORRELATION  IMBALANCE 

Distinct17
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size4.5 MiB
FA$TRK (Small Loan Express) 282368
Guaranty 243090
Lender Advantage Initiative 31521
Community Advantage Initiative 6457
Patriot Express 6247
Other values (12) 18370

Length

Max length49
Median length27
Mean length18.942708
Min length8

Characters and Unicode

Total characters11139316
Distinct characters53
Distinct categories9
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0
Unique (%)0.0%

Sample

1st rowFA$TRK (Small Loan Express)
2nd rowGuaranty
3rd rowGuaranty
4th rowFA$TRK (Small Loan Express)
5th rowCommunity Express

Common Values

ValueCountFrequency (%)
FA$TRK (Small Loan Express) 282368
48.0%
Guaranty 243090
41.3%
Lender Advantage Initiative 31521
 
5.4%
Community Advantage Initiative 6457
 
1.1%
Patriot Express 6247
 
1.1%
Community Express 5468
 
0.9%
Standard Asset Based 3394
 
0.6%
Rural Lender Advantage 3096
 
0.5%
Revolving Line of Credit Exports - Sec. 7(a) (14) 1800
 
0.3%
Gulf Opportunity 1720
 
0.3%
Other values (7) 2892
 
0.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
express 294083
18.8%
small 282571
18.1%
fa$trk 282368
18.1%
loan 282368
18.1%
guaranty 243703
15.6%
advantage 41074
 
2.6%
initiative 37978
 
2.4%
lender 34617
 
2.2%
community 11925
 
0.8%
patriot 6247
 
0.4%
Other values (30) 46379
 
3.0%

Most occurring characters

ValueCountFrequency (%)
a 1202206
 
10.8%
(space) 975260
 
8.8%
n 666884
 
6.0%
s 600685
 
5.4%
r 595180
 
5.3%
l 573853
 
5.2%
e 462738
 
4.2%
t 404206
 
3.6%
A 327156
 
2.9%
L 319398
 
2.9%
Other values (43) 5011750
45.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6602426
59.3%
Uppercase Letter 2681268
24.1%
Space Separator 975260
 
8.8%
Open Punctuation 289682
 
2.6%
Close Punctuation 289682
 
2.6%
Currency Symbol 282368
 
2.5%
Decimal Number 11191
 
0.1%
Dash Punctuation 3782
 
< 0.1%
Other Punctuation 3657
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 327156
12.2%
L 319398
11.9%
E 296144
11.0%
S 289860
10.8%
R 287547
10.7%
T 284103
10.6%
F 282493
10.5%
K 282368
10.5%
G 245578
9.2%
I 39713
 
1.5%
Other values (10) 26908
 
1.0%
Lowercase Letter
ValueCountFrequency (%)
a 1202206
18.2%
n 666884
10.1%
s 600685
9.1%
r 595180
9.0%
l 573853
8.7%
e 462738
 
7.0%
t 404206
 
6.1%
o 310761
 
4.7%
m 306421
 
4.6%
p 299323
 
4.5%
Other values (9) 1180169
17.9%
Decimal Number
ValueCountFrequency (%)
7 3657
32.7%
1 3502
31.3%
4 1800
16.1%
6 1702
15.2%
9 280
 
2.5%
5 125
 
1.1%
0 125
 
1.1%
Other Punctuation
ValueCountFrequency (%)
. 1955
53.5%
, 1702
46.5%
Space Separator
ValueCountFrequency (%)
(space) 975260
100.0%
Open Punctuation
ValueCountFrequency (%)
( 289682
100.0%
Close Punctuation
ValueCountFrequency (%)
) 289682
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 282368
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3782
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9283694
83.3%
Common 1855622
 
16.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1202206
 
12.9%
n 666884
 
7.2%
s 600685
 
6.5%
r 595180
 
6.4%
l 573853
 
6.2%
e 462738
 
5.0%
t 404206
 
4.4%
A 327156
 
3.5%
L 319398
 
3.4%
o 310761
 
3.3%
Other values (29) 3820627
41.2%
Common
ValueCountFrequency (%)
(space) 975260
52.6%
( 289682
 
15.6%
) 289682
 
15.6%
$ 282368
 
15.2%
- 3782
 
0.2%
7 3657
 
0.2%
1 3502
 
0.2%
. 1955
 
0.1%
4 1800
 
0.1%
, 1702
 
0.1%
Other values (4) 2232
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11139316
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1202206
 
10.8%
(space) 975260
 
8.8%
n 666884
 
6.0%
s 600685
 
5.4%
r 595180
 
5.3%
l 573853
 
5.2%
e 462738
 
4.2%
t 404206
 
3.6%
A 327156
 
2.9%
L 319398
 
2.9%
Other values (43) 5011750
45.0%

InitialInterestRate
Real number (ℝ)

Distinct1869
Distinct (%)0.3%
Missing1
Missing (%)< 0.1%
Infinite0
Infinite (%)0.0%
Mean6.5260795
Minimum0
Maximum13.5
Zeros21
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size4.5 MiB

Quantile statistics

Minimum0
5-th percentile4.5
Q15.5
median6
Q37.5
95-th percentile9.75
Maximum13.5
Range13.5
Interquartile range (IQR)2

Descriptive statistics

Standard deviation1.5680676
Coefficient of variation (CV)0.24027712
Kurtosis0.80181718
Mean6.5260795
Median Absolute Deviation (MAD)0.75
Skewness0.83206164
Sum3837674.1
Variance2.458836
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
6 88470
 
15.0%
5.5 38324
 
6.5%
5.25 35171
 
6.0%
5.75 31785
 
5.4%
6.25 28211
 
4.8%
7 22706
 
3.9%
6.5 21248
 
3.6%
7.75 19919
 
3.4%
5 19804
 
3.4%
7.5 17968
 
3.1%
Other values (1859) 264446
45.0%
ValueCountFrequency (%)
0 21
 
< 0.1%
0.5 3
 
< 0.1%
0.66 1
 
< 0.1%
0.75 7
 
< 0.1%
0.85 1
 
< 0.1%
1 71
< 0.1%
1.01 1
 
< 0.1%
1.16 1
 
< 0.1%
1.2 1
 
< 0.1%
1.24 1
 
< 0.1%
ValueCountFrequency (%)
13.5 2
 
< 0.1%
13.25 3
 
< 0.1%
12.99 12
< 0.1%
12.75 3
 
< 0.1%
12.74 2
 
< 0.1%
12.5 3
 
< 0.1%
12.493 1
 
< 0.1%
12.49 4
 
< 0.1%
12.25 2
 
< 0.1%
12.24 4
 
< 0.1%

TermInMonths
Real number (ℝ)

Distinct335
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean123.53789
Minimum0
Maximum847
Zeros109
Zeros (%)< 0.1%
Negative0
Negative (%)0.0%
Memory size4.5 MiB

Quantile statistics

Minimum0
5-th percentile36
Q184
median91
Q3120
95-th percentile300
Maximum847
Range847
Interquartile range (IQR)36
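
The quartiles above can be turned into outlier fences with Tukey's 1.5 × IQR rule. This rule is a common convention applied here as an assumption; the report itself does not flag outliers this way. The sketch uses the reported Q1 and Q3 for TermInMonths, and the tested terms are values that appear in this section's tables.

```python
q1, q3 = 84, 120
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

def is_outlier(term_months):
    """Tukey's rule: outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]."""
    return term_months < lower_fence or term_months > upper_fence

print(iqr, lower_fence, upper_fence)
print([t for t in (0, 84, 300, 847) if is_outlier(t)])
```

The rule also flags the 300-month term, which accounts for 11.1% of rows, so it is better treated as a prompt for inspection than as a filter.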

Descriptive statistics

Standard deviation79.111359
Coefficient of variation (CV)0.64038136
Kurtosis0.69057441
Mean123.53789
Median Absolute Deviation (MAD)29
Skewness1.3370841
Sum72646824
Variance6258.6071
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
84 158893
27.0%
120 150221
25.5%
300 65482
11.1%
60 59789
 
10.2%
240 12754
 
2.2%
180 8711
 
1.5%
36 8429
 
1.4%
12 6489
 
1.1%
48 6405
 
1.1%
72 5965
 
1.0%
Other values (325) 104915
17.8%
ValueCountFrequency (%)
0 109
 
< 0.1%
1 182
 
< 0.1%
2 251
< 0.1%
3 324
0.1%
4 304
0.1%
5 269
< 0.1%
6 544
0.1%
7 277
< 0.1%
8 248
< 0.1%
9 260
< 0.1%
ValueCountFrequency (%)
847 1
 
< 0.1%
720 2
< 0.1%
400 1
 
< 0.1%
360 4
< 0.1%
352 1
 
< 0.1%
345 1
 
< 0.1%
343 1
 
< 0.1%
332 1
 
< 0.1%
328 2
< 0.1%
326 2
< 0.1%

NaicsCode
Real number (ℝ)

Distinct1256
Distinct (%)0.2%
Missing12
Missing (%)< 0.1%
Infinite0
Infinite (%)0.0%
Mean523739.05
Minimum111110
Maximum928120
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size4.5 MiB

Quantile statistics

Minimum111110
5-th percentile237310
Q1424930
median541110
Q3624410
95-th percentile812111
Maximum928120
Range817010
Interquartile range (IQR)199480

Descriptive statistics

Standard deviation175451.23
Coefficient of variation (CV)0.33499742
Kurtosis-0.69789387
Mean523739.05
Median Absolute Deviation (MAD)116120
Skewness-0.17036997
Sum3.0798004 × 10^11
Variance3.0783134 × 10^10
MonotonicityNot monotonic
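
NaicsCode is profiled as a real number, but it is an industry classification code, so statistics such as the mean or skewness carry no meaning. The first two digits of a NAICS code identify the sector. A sketch of extracting that prefix follows; the helper name `naics_sector` is hypothetical, and the float input mimics how a numeric column with missing values is often stored.

```python
def naics_sector(code):
    """Return the two-digit sector prefix of a six-digit NAICS code, or None if missing."""
    if code is None:
        return None
    text = str(int(code))
    return text[:2] if len(text) == 6 else None

codes = [722511, 722513.0, 621210, None]
sectors = [naics_sector(c) for c in codes]
print(sectors)
```

Grouping by sector reduces 1256 distinct codes to a far smaller number of categories.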
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
722511 22534
 
3.8%
722513 14979
 
2.5%
621210 11244
 
1.9%
484121 9798
 
1.7%
721110 9453
 
1.6%
713940 9149
 
1.6%
621111 8618
 
1.5%
561730 8461
 
1.4%
811111 8363
 
1.4%
238990 8341
 
1.4%
Other values (1246) 477101
81.1%
ValueCountFrequency (%)
111110 81
< 0.1%
111120 5
 
< 0.1%
111130 1
 
< 0.1%
111140 17
 
< 0.1%
111150 51
< 0.1%
111160 1
 
< 0.1%
111191 19
 
< 0.1%
111199 45
< 0.1%
111211 23
 
< 0.1%
111219 103
< 0.1%
ValueCountFrequency (%)
928120 2
 
< 0.1%
926150 9
< 0.1%
926140 1
 
< 0.1%
926130 3
 
< 0.1%
926120 2
 
< 0.1%
926110 3
 
< 0.1%
925120 1
 
< 0.1%
924120 1
 
< 0.1%
924110 11
< 0.1%
923130 7
< 0.1%

NaicsDescription
Categorical

Distinct1220
Distinct (%)0.2%
Missing742
Missing (%)0.1%
Memory size4.5 MiB
Full-Service Restaurants 30672
Limited-Service Restaurants 20473
Offices of Dentists 11244
General Freight Trucking, Long Distance, Truckload 9798
Hotels (except Casino Hotels) and Motels 9453
Other values (1215) 505671

Length

Max length70
Median length57
Mean length34.014634
Min length7

Characters and Unicode

Total characters19977169
Distinct characters59
Distinct categories7
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique51
Unique (%)< 0.1%

Sample

1st rowOffices of Real Estate Agents and Brokers
2nd rowDrinking Places (Alcoholic Beverages)
3rd rowCar Washes
4th rowCommercial Lithographic Printing
5th rowAutomotive Body, Paint, and Interior Repair and Maintenance

Common Values

ValueCountFrequency (%)
Full-Service Restaurants 30672
 
5.2%
Limited-Service Restaurants 20473
 
3.5%
Offices of Dentists 11244
 
1.9%
General Freight Trucking, Long Distance, Truckload 9798
 
1.7%
Hotels (except Casino Hotels) and Motels 9453
 
1.6%
Fitness and Recreational Sports Centers 9149
 
1.6%
Offices of Physicians (except Mental Health Specialists) 8618
 
1.5%
Landscaping Services 8461
 
1.4%
General Automotive Repair 8363
 
1.4%
All Other Specialty Trade Contractors 8341
 
1.4%
Other values (1210) 462739
78.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
and 213787
 
8.7%
other 101785
 
4.1%
services 97409
 
4.0%
stores 63473
 
2.6%
of 55282
 
2.2%
restaurants 51145
 
2.1%
offices 49218
 
2.0%
except 48580
 
2.0%
all 46729
 
1.9%
contractors 44293
 
1.8%
Other values (1488) 1693893
68.7%

Most occurring characters

ValueCountFrequency (%)
e 2120751
 
10.6%
(space) 1878283
 
9.4%
r 1379836
 
6.9%
a 1323156
 
6.6%
t 1321203
 
6.6%
n 1310268
 
6.6%
i 1308803
 
6.6%
s 1140870
 
5.7%
o 1013258
 
5.1%
c 833558
 
4.2%
Other values (49) 6347183
31.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15581319
78.0%
Uppercase Letter 2198173
 
11.0%
Space Separator 1878283
 
9.4%
Other Punctuation 136706
 
0.7%
Dash Punctuation 73925
 
0.4%
Open Punctuation 55555
 
0.3%
Close Punctuation 53208
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2120751
13.6%
r 1379836
8.9%
a 1323156
8.5%
t 1321203
8.5%
n 1310268
8.4%
i 1308803
8.4%
s 1140870
 
7.3%
o 1013258
 
6.5%
c 833558
 
5.3%
l 820516
 
5.3%
Other values (16) 3009100
19.3%
Uppercase Letter
ValueCountFrequency (%)
S 378889
17.2%
C 238286
10.8%
O 176894
 
8.0%
M 174836
 
8.0%
A 142700
 
6.5%
R 137357
 
6.2%
P 134828
 
6.1%
F 110242
 
5.0%
L 89686
 
4.1%
T 85211
 
3.9%
Other values (15) 529244
24.1%
Other Punctuation
ValueCountFrequency (%)
, 127879
93.5%
' 8213
 
6.0%
¿ 578
 
0.4%
/ 36
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 1878283
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 73925
100.0%
Open Punctuation
ValueCountFrequency (%)
( 55555
100.0%
Close Punctuation
ValueCountFrequency (%)
) 53208
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 17779492
89.0%
Common 2197677
 
11.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2120751
11.9%
r 1379836
 
7.8%
a 1323156
 
7.4%
t 1321203
 
7.4%
n 1310268
 
7.4%
i 1308803
 
7.4%
s 1140870
 
6.4%
o 1013258
 
5.7%
c 833558
 
4.7%
l 820516
 
4.6%
Other values (41) 5207273
29.3%
Common
ValueCountFrequency (%)
(space) 1878283
85.5%
, 127879
 
5.8%
- 73925
 
3.4%
( 55555
 
2.5%
) 53208
 
2.4%
' 8213
 
0.4%
¿ 578
 
< 0.1%
/ 36
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19976591
> 99.9%
None 578
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2120751
 
10.6%
(space) 1878283
 
9.4%
r 1379836
 
6.9%
a 1323156
 
6.6%
t 1321203
 
6.6%
n 1310268
 
6.6%
i 1308803
 
6.6%
s 1140870
 
5.7%
o 1013258
 
5.1%
c 833558
 
4.2%
Other values (48) 6346605
31.8%
None
ValueCountFrequency (%)
¿ 578
100.0%
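
The only non-ASCII character in NaicsDescription is the inverted question mark (U+00BF), which occurs 578 times and may point to an earlier encoding problem in the source data. A sketch of locating such characters follows; the input strings and the helper name `non_ascii_counts` are hypothetical.

```python
from collections import Counter

def non_ascii_counts(values):
    """Count every character outside the ASCII range across the given strings."""
    counts = Counter()
    for value in values:
        counts.update(ch for ch in value if not ch.isascii())
    return counts

# Hypothetical strings; the second contains an inverted question mark (U+00BF).
descriptions = ["Offices of Dentists", "Caf\u00bf and Snack Bars"]
found = non_ascii_counts(descriptions)
print({f"U+{ord(ch):04X}": n for ch, n in found.items()})
```

Listing the affected rows alongside the counts makes it easier to decide whether the character should be replaced or the file re-read with a different encoding.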

FranchiseCode
Categorical

HIGH CARDINALITY  MISSING 

Distinct5444
Distinct (%)10.2%
Missing534676
Missing (%)90.9%
Memory size4.5 MiB
13680 650
78760 649
10245 407
11965 393
13913 351
Other values (5439) 50927

Length

Max length5
Median length5
Mean length5
Min length5

Characters and Unicode

Total characters266885
Distinct characters12
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1816
Unique (%)3.4%

Sample

1st row03512
2nd row06560
3rd row79140
4th row04154
5th row70552
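
FranchiseCode values are always five characters long, can start with a zero (03512), and use an "S" prefix that appears in both upper case (18548 times) and lower case (79 times). A sketch of normalizing the codes as strings follows; the helper name and the integer input are hypothetical, the latter showing what happens if a code was read as a number.

```python
def normalize_franchise_code(raw):
    """Return a 5-character upper-case code, or None when the value is missing."""
    if raw is None:
        return None
    text = str(raw).strip().upper()
    if not text:
        return None
    return text.zfill(5)

raw_codes = ["03512", 3512, "s1639", "S1639", None]
codes = [normalize_franchise_code(c) for c in raw_codes]
print(codes)
```

With 90.9% of values missing, a separate "has franchise code" indicator may be more useful than the code itself.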

Common Values

ValueCountFrequency (%)
13680 650
 
0.1%
78760 649
 
0.1%
10245 407
 
0.1%
11965 393
 
0.1%
13913 351
 
0.1%
13593 322
 
0.1%
12645 306
 
0.1%
13058 299
 
0.1%
S1639 297
 
0.1%
S0122 256
 
< 0.1%
Other values (5434) 49447
 
8.4%
(Missing) 534676
90.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
13680 650
 
1.2%
78760 649
 
1.2%
10245 407
 
0.8%
11965 393
 
0.7%
13913 351
 
0.7%
13593 322
 
0.6%
12645 306
 
0.6%
13058 299
 
0.6%
s1639 298
 
0.6%
s0122 257
 
0.5%
Other values (5372) 49445
92.6%

Most occurring characters

ValueCountFrequency (%)
1 54931
20.6%
0 32727
12.3%
3 25257
9.5%
2 25146
9.4%
5 21962
 
8.2%
4 19836
 
7.4%
S 18548
 
6.9%
9 17809
 
6.7%
6 17231
 
6.5%
7 17216
 
6.5%
Other values (2) 16222
 
6.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 248258
93.0%
Uppercase Letter 18548
 
6.9%
Lowercase Letter 79
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 54931
22.1%
0 32727
13.2%
3 25257
10.2%
2 25146
10.1%
5 21962
 
8.8%
4 19836
 
8.0%
9 17809
 
7.2%
6 17231
 
6.9%
7 17216
 
6.9%
8 16143
 
6.5%
Uppercase Letter
ValueCountFrequency (%)
S 18548
100.0%
Lowercase Letter
ValueCountFrequency (%)
s 79
100.0%
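The per-category counts above follow from the Unicode general category of each character. A minimal sketch using Python's standard `unicodedata` module on the five sample FranchiseCode rows plus one illustrative lowercase-prefixed code; the long category labels are a hand-written mapping chosen to match this report, not something the module returns:

```python
import unicodedata
from collections import Counter

# Hand-written mapping from general-category codes to the labels in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
}

def category_counts(values):
    """Count characters per Unicode general category across all values."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# Five sample rows from the report plus one illustrative lowercase-prefixed code.
codes = ["03512", "06560", "79140", "04154", "70552", "s1639"]
counts = category_counts(codes)
print(counts)
```

A non-zero Lowercase Letter count in a column of otherwise uppercase codes is a quick hint that casing may be inconsistent.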

Most occurring scripts

ValueCountFrequency (%)
Common 248258
93.0%
Latin 18627
 
7.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 54931
22.1%
0 32727
13.2%
3 25257
10.2%
2 25146
10.1%
5 21962
 
8.8%
4 19836
 
8.0%
9 17809
 
7.2%
6 17231
 
6.9%
7 17216
 
6.9%
8 16143
 
6.5%
Latin
ValueCountFrequency (%)
S 18548
99.6%
s 79
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 266885
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 54931
20.6%
0 32727
12.3%
3 25257
9.5%
2 25146
9.4%
5 21962
 
8.2%
4 19836
 
7.4%
S 18548
 
6.9%
9 17809
 
6.7%
6 17231
 
6.5%
7 17216
 
6.5%
Other values (2) 16222
 
6.1%

FranchiseName
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 4768
Distinct (%): 9.0%
Missing: 534790
Missing (%): 90.9%
Memory size: 4.5 MiB
Most frequent values (count): SUBWAY (650); SUBWAY SANDWICH SHOP (649); JIMMY JOHN'S (593); ANYTIME FITNESS (571); THE UPS STORE (432); other values (4763): 50368

Length

Max length: 30
Median length: 23
Mean length: 16.085857
Min length: 3

Characters and Unicode

Total characters: 856781
Distinct characters: 86
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1478
Unique (%): 2.8%

Sample

1st row: ALPHAGRAPHICS, PRINTSHOPS OF T
2nd row: AUNTIE ANN'S (SOFT PRETZELS)
3rd row: SUPER 8 MOTEL
4th row: AMERIPRISE FINANCIAL SERVICES
5th row: RODEWAY INNS

Common Values

Value | Count | Frequency (%)
SUBWAY | 650 | 0.1%
SUBWAY SANDWICH SHOP | 649 | 0.1%
JIMMY JOHN'S | 593 | 0.1%
ANYTIME FITNESS | 571 | 0.1%
THE UPS STORE | 432 | 0.1%
FIREHOUSE SUBS | 363 | 0.1%
DAYS INN | 327 | 0.1%
SPORT CLIPS | 322 | 0.1%
MASSAGE ENVY | 312 | 0.1%
ORANGE THEORY FITNESS | 306 | 0.1%
Other values (4758) | 48738 | 8.3%
(Missing) | 534790 | 90.9%

Length

Figure: Histogram of lengths of the category
ValueCountFrequency (%)
inn 2851
 
2.1%
the 2804
 
2.1%
2672
 
2.0%
pizza 2060
 
1.5%
fitness 1839
 
1.4%
subway 1596
 
1.2%
suites 1152
 
0.9%
shop 879
 
0.7%
john's 871
 
0.7%
anytime 860
 
0.6%
Other values (4597) 116304
86.9%

Most occurring characters

ValueCountFrequency (%)
80516
 
9.4%
E 51306
 
6.0%
S 49245
 
5.7%
A 42850
 
5.0%
I 39561
 
4.6%
T 35445
 
4.1%
N 34652
 
4.0%
R 34419
 
4.0%
O 32337
 
3.8%
e 28794
 
3.4%
Other values (76) 427656
49.9%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 527563
61.6%
Lowercase Letter 225274
26.3%
Space Separator 80851
 
9.4%
Other Punctuation 14287
 
1.7%
Decimal Number 3640
 
0.4%
Dash Punctuation 2651
 
0.3%
Open Punctuation 1525
 
0.2%
Close Punctuation 847
 
0.1%
Math Symbol 133
 
< 0.1%
Other Symbol 8
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 28794
12.8%
a 20307
 
9.0%
r 19462
 
8.6%
i 18668
 
8.3%
o 18616
 
8.3%
n 17930
 
8.0%
t 15067
 
6.7%
s 14487
 
6.4%
l 10386
 
4.6%
u 8246
 
3.7%
Other values (18) 53311
23.7%
Uppercase Letter
ValueCountFrequency (%)
E 51306
 
9.7%
S 49245
 
9.3%
A 42850
 
8.1%
I 39561
 
7.5%
T 35445
 
6.7%
N 34652
 
6.6%
R 34419
 
6.5%
O 32337
 
6.1%
C 25876
 
4.9%
L 20320
 
3.9%
Other values (17) 161552
30.6%
Other Punctuation
ValueCountFrequency (%)
' 7020
49.1%
/ 2232
 
15.6%
& 2156
 
15.1%
. 1906
 
13.3%
, 772
 
5.4%
! 135
 
0.9%
¿ 30
 
0.2%
? 23
 
0.2%
# 8
 
0.1%
% 3
 
< 0.1%
Other values (2) 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
8 747
20.5%
1 640
17.6%
6 610
16.8%
0 452
12.4%
4 281
 
7.7%
5 232
 
6.4%
2 209
 
5.7%
3 205
 
5.6%
9 170
 
4.7%
7 94
 
2.6%
Space Separator
Value | Count | Frequency (%)
(space) | 80516 | 99.6%
(non-ASCII space) | 335 | 0.4%
Math Symbol
ValueCountFrequency (%)
+ 127
95.5%
= 6
 
4.5%
Dash Punctuation
ValueCountFrequency (%)
- 2651
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1525
100.0%
Close Punctuation
ValueCountFrequency (%)
) 847
100.0%
Other Symbol
ValueCountFrequency (%)
® 8
100.0%
Other Number
ValueCountFrequency (%)
² 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 752837
87.9%
Common 103944
 
12.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 51306
 
6.8%
S 49245
 
6.5%
A 42850
 
5.7%
I 39561
 
5.3%
T 35445
 
4.7%
N 34652
 
4.6%
R 34419
 
4.6%
O 32337
 
4.3%
e 28794
 
3.8%
C 25876
 
3.4%
Other values (45) 378352
50.3%
Common
ValueCountFrequency (%)
80516
77.5%
' 7020
 
6.8%
- 2651
 
2.6%
/ 2232
 
2.1%
& 2156
 
2.1%
. 1906
 
1.8%
( 1525
 
1.5%
) 847
 
0.8%
, 772
 
0.7%
8 747
 
0.7%
Other values (21) 3572
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 856378
> 99.9%
None 403
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
80516
 
9.4%
E 51306
 
6.0%
S 49245
 
5.8%
A 42850
 
5.0%
I 39561
 
4.6%
T 35445
 
4.1%
N 34652
 
4.0%
R 34419
 
4.0%
O 32337
 
3.8%
e 28794
 
3.4%
Other values (68) 427253
49.9%
None
ValueCountFrequency (%)
  335
83.1%
¿ 30
 
7.4%
é 12
 
3.0%
É 9
 
2.2%
® 8
 
2.0%
ö 6
 
1.5%
² 2
 
0.5%
· 1
 
0.2%

ProjectCounty
Categorical

Distinct: 1889
Distinct (%): 0.3%
Missing: 2
Missing (%): < 0.1%
Memory size: 4.5 MiB
Most frequent values (count): LOS ANGELES (22751); ORANGE (11133); COOK (8888); MARICOPA (8474); HARRIS (8340); other values (1884): 528465

Length

Max length: 25
Median length: 20
Mean length: 7.511607
Min length: 3

Characters and Unicode

Total characters: 4417208
Distinct characters: 32
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 54
Unique (%): < 0.1%

Sample

1st row: DENVER
2nd row: SACRAMENTO
3rd row: YOLO
4th row: FAIRFIELD
5th row: ELKHART

Common Values

Value | Count | Frequency (%)
LOS ANGELES | 22751 | 3.9%
ORANGE | 11133 | 1.9%
COOK | 8888 | 1.5%
MARICOPA | 8474 | 1.4%
HARRIS | 8340 | 1.4%
SAN DIEGO | 6883 | 1.2%
MONTGOMERY | 6784 | 1.2%
MIDDLESEX | 6654 | 1.1%
FRANKLIN | 6222 | 1.1%
JEFFERSON | 6099 | 1.0%
Other values (1879) | 495823 | 84.3%

Length

Figure: Histogram of lengths of the category
ValueCountFrequency (%)
los 22764
 
3.3%
angeles 22751
 
3.3%
san 16887
 
2.5%
orange 11133
 
1.6%
cook 8888
 
1.3%
maricopa 8474
 
1.2%
harris 8340
 
1.2%
new 8154
 
1.2%
lake 7858
 
1.2%
diego 6883
 
1.0%
Other values (1907) 559221
82.1%

Most occurring characters

ValueCountFrequency (%)
A 513339
11.6%
E 434906
 
9.8%
N 386797
 
8.8%
O 364196
 
8.2%
R 296063
 
6.7%
S 290357
 
6.6%
L 290334
 
6.6%
I 241762
 
5.5%
T 180178
 
4.1%
M 150406
 
3.4%
Other values (22) 1268870
28.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4319180
97.8%
Space Separator 93302
 
2.1%
Dash Punctuation 4711
 
0.1%
Lowercase Letter 15
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 513339
11.9%
E 434906
 
10.1%
N 386797
 
9.0%
O 364196
 
8.4%
R 296063
 
6.9%
S 290357
 
6.7%
L 290334
 
6.7%
I 241762
 
5.6%
T 180178
 
4.2%
M 150406
 
3.5%
Other values (16) 1170842
27.1%
Lowercase Letter
ValueCountFrequency (%)
a 6
40.0%
i 3
20.0%
p 3
20.0%
n 3
20.0%
Space Separator
ValueCountFrequency (%)
93302
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4711
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4319195
97.8%
Common 98013
 
2.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 513339
11.9%
E 434906
 
10.1%
N 386797
 
9.0%
O 364196
 
8.4%
R 296063
 
6.9%
S 290357
 
6.7%
L 290334
 
6.7%
I 241762
 
5.6%
T 180178
 
4.2%
M 150406
 
3.5%
Other values (20) 1170857
27.1%
Common
ValueCountFrequency (%)
93302
95.2%
- 4711
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4417208
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 513339
11.6%
E 434906
 
9.8%
N 386797
 
8.8%
O 364196
 
8.2%
R 296063
 
6.7%
S 290357
 
6.6%
L 290334
 
6.6%
I 241762
 
5.5%
T 180178
 
4.1%
M 150406
 
3.4%
Other values (22) 1268870
28.7%

ProjectState
Categorical

HIGH CARDINALITY  HIGH CORRELATION 

Distinct: 58
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB
Most frequent values (count): CA (73437); TX (44815); NY (37465); OH (33473); FL (28650); other values (53): 370213

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 1176106
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: CO
2nd row: CA
3rd row: CA
4th row: CT
5th row: IN

Common Values

Value | Count | Frequency (%)
CA | 73437 | 12.5%
TX | 44815 | 7.6%
NY | 37465 | 6.4%
OH | 33473 | 5.7%
FL | 28650 | 4.9%
MI | 22983 | 3.9%
MA | 20765 | 3.5%
IL | 19460 | 3.3%
PA | 18831 | 3.2%
MN | 16980 | 2.9%
Other values (48) | 271194 | 46.1%

Length

Figure: Histogram of lengths of the category
ValueCountFrequency (%)
ca 73437
 
12.5%
tx 44815
 
7.6%
ny 37465
 
6.4%
oh 33473
 
5.7%
fl 28650
 
4.9%
mi 22983
 
3.9%
ma 20765
 
3.5%
il 19460
 
3.3%
pa 18831
 
3.2%
mn 16980
 
2.9%
Other values (48) 271194
46.1%

Most occurring characters

ValueCountFrequency (%)
A 183962
15.6%
N 119602
10.2%
C 111818
 
9.5%
M 94427
 
8.0%
I 87963
 
7.5%
T 74075
 
6.3%
O 72675
 
6.2%
L 56877
 
4.8%
X 44815
 
3.8%
Y 43966
 
3.7%
Other values (14) 285926
24.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1176106
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 183962
15.6%
N 119602
10.2%
C 111818
 
9.5%
M 94427
 
8.0%
I 87963
 
7.5%
T 74075
 
6.3%
O 72675
 
6.2%
L 56877
 
4.8%
X 44815
 
3.8%
Y 43966
 
3.7%
Other values (14) 285926
24.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 1176106
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 183962
15.6%
N 119602
10.2%
C 111818
 
9.5%
M 94427
 
8.0%
I 87963
 
7.5%
T 74075
 
6.3%
O 72675
 
6.2%
L 56877
 
4.8%
X 44815
 
3.8%
Y 43966
 
3.7%
Other values (14) 285926
24.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1176106
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 183962
15.6%
N 119602
10.2%
C 111818
 
9.5%
M 94427
 
8.0%
I 87963
 
7.5%
T 74075
 
6.3%
O 72675
 
6.2%
L 56877
 
4.8%
X 44815
 
3.8%
Y 43966
 
3.7%
Other values (14) 285926
24.3%

SBADistrictOffice
Categorical

HIGH CARDINALITY  HIGH CORRELATION 

Distinct: 74
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB
Most frequent values (count): LOS ANGELES DISTRICT OFFICE (24831); MICHIGAN DISTRICT OFFICE (22982); SOUTH FLORIDA DISTRICT OFFICE (20482); ILLINOIS DISTRICT OFFICE (19458); MASSACHUSETTS DISTRICT OFFICE (18941); other values (69): 481359

Length

Max length: 40
Median length: 34
Mean length: 26.511124
Min length: 21

Characters and Unicode

Total characters: 15589946
Distinct characters: 29
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: COLORADO DISTRICT OFFICE
2nd row: SACRAMENTO DISTRICT OFFICE
3rd row: SACRAMENTO DISTRICT OFFICE
4th row: CONNECTICUT DISTRICT OFFICE
5th row: INDIANA DISTRICT OFFICE

Common Values

Value | Count | Frequency (%)
LOS ANGELES DISTRICT OFFICE | 24831 | 4.2%
MICHIGAN DISTRICT OFFICE | 22982 | 3.9%
SOUTH FLORIDA DISTRICT OFFICE | 20482 | 3.5%
ILLINOIS DISTRICT OFFICE | 19458 | 3.3%
MASSACHUSETTS DISTRICT OFFICE | 18941 | 3.2%
DALLAS / FT WORTH DISTRICT OFFICE | 17796 | 3.0%
NEW YORK DISTRICT OFFICE | 17516 | 3.0%
MINNESOTA DISTRICT OFFICE | 16980 | 2.9%
COLUMBUS DISTRICT OFFICE | 16979 | 2.9%
CLEVELAND DISTRICT OFFICE | 16495 | 2.8%
Other values (64) | 395593 | 67.3%

Length

Figure: Histogram of lengths of the category
ValueCountFrequency (%)
office 588053
29.1%
district 567133
28.1%
new 42164
 
2.1%
san 29532
 
1.5%
florida 28650
 
1.4%
south 26615
 
1.3%
los 24831
 
1.2%
angeles 24831
 
1.2%
michigan 22982
 
1.1%
north 21896
 
1.1%
Other values (86) 642812
31.8%

Most occurring characters

ValueCountFrequency (%)
I 2168720
13.9%
T 1440496
9.2%
1431446
9.2%
C 1392928
8.9%
F 1255506
 
8.1%
S 1069339
 
6.9%
O 1067809
 
6.8%
E 981484
 
6.3%
R 863488
 
5.5%
D 729117
 
4.7%
Other values (19) 3189613
20.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 13547678
86.9%
Space Separator 2019499
 
13.0%
Other Punctuation 22769
 
0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 2168720
16.0%
T 1440496
10.6%
C 1392928
10.3%
F 1255506
9.3%
S 1069339
7.9%
O 1067809
7.9%
E 981484
7.2%
R 863488
 
6.4%
D 729117
 
5.4%
A 664280
 
4.9%
Other values (15) 1914511
14.1%
Space Separator
Value | Count | Frequency (%)
(space) | 1431446 | 70.9%
(non-ASCII space) | 588053 | 29.1%
Other Punctuation
ValueCountFrequency (%)
/ 17796
78.2%
. 4973
 
21.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 13547678
86.9%
Common 2042268
 
13.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 2168720
16.0%
T 1440496
10.6%
C 1392928
10.3%
F 1255506
9.3%
S 1069339
7.9%
O 1067809
7.9%
E 981484
7.2%
R 863488
 
6.4%
D 729117
 
5.4%
A 664280
 
4.9%
Other values (15) 1914511
14.1%
Common
ValueCountFrequency (%)
1431446
70.1%
  588053
28.8%
/ 17796
 
0.9%
. 4973
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15001893
96.2%
None 588053
 
3.8%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 2168720
14.5%
T 1440496
9.6%
1431446
9.5%
C 1392928
9.3%
F 1255506
8.4%
S 1069339
7.1%
O 1067809
7.1%
E 981484
 
6.5%
R 863488
 
5.8%
D 729117
 
4.9%
Other values (18) 2601560
17.3%
None
ValueCountFrequency (%)
  588053
100.0%
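The non-ASCII space separator occurs 588053 times, once per observation, and the sample rows suggest it trails each office name. A minimal cleanup sketch, assuming the character is U+00A0 NO-BREAK SPACE (the report does not name the code point); `clean_office` is a hypothetical helper, not part of any library:

```python
NBSP = "\u00a0"  # Assumed identity of the trailing non-ASCII space.

def clean_office(name: str) -> str:
    """Replace no-break spaces with plain spaces and trim both ends."""
    return name.replace(NBSP, " ").strip()

# One sample value from the report with the assumed trailing character.
raw = "COLORADO DISTRICT OFFICE" + NBSP
cleaned = clean_office(raw)
print(repr(raw), "->", repr(cleaned))
```

Normalising before grouping or joining avoids treating visually identical office names as different keys.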

CongressionalDistrict
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 54
Distinct (%): < 0.1%
Missing: 35
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.241255
Minimum: 0
Maximum: 53
Zeros: 19793
Zeros (%): 3.4%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 3
median: 6
Q3: 13
95-th percentile: 35
Maximum: 53
Range: 53
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 11.014161
Coefficient of variation (CV): 1.0754699
Kurtosis: 2.9598985
Mean: 10.241255
Median Absolute Deviation (MAD): 4
Skewness: 1.7992169
Sum: 6022042
Variance: 121.31173
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=50)
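Several of the CongressionalDistrict statistics above are derived from one another. A short sketch that recomputes them from the reported figures, using the usual definitions (CV as standard deviation over mean, variance as the squared standard deviation, IQR as Q3 minus Q1); that the report uses exactly these definitions is an assumption, although the numbers agree:

```python
# Figures copied from the report.
mean = 10.241255
std = 11.014161
variance = 121.31173
q1, q3 = 3, 13

# Derived quantities under the usual definitions.
cv = std / mean          # Coefficient of variation
var_from_std = std ** 2  # Variance recomputed from the standard deviation
iqr = q3 - q1            # Interquartile range

print(f"CV = {cv:.7f}")
print(f"variance = {var_from_std:.4f}")
print(f"IQR = {iqr}")
```

A CV above 1 together with a mean (10.24) well above the median (6) is consistent with the right-skewed distribution the skewness value indicates.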
Common values
Value | Count | Frequency (%)
1 | 58261 | 9.9%
2 | 55536 | 9.4%
3 | 49998 | 8.5%
4 | 43294 | 7.4%
6 | 36697 | 6.2%
5 | 36671 | 6.2%
7 | 35706 | 6.1%
8 | 26399 | 4.5%
9 | 22340 | 3.8%
0 | 19793 | 3.4%
Other values (44) | 203323 | 34.6%

Minimum 10 values
Value | Count | Frequency (%)
0 | 19793 | 3.4%
1 | 58261 | 9.9%
2 | 55536 | 9.4%
3 | 49998 | 8.5%
4 | 43294 | 7.4%
5 | 36671 | 6.2%
6 | 36697 | 6.2%
7 | 35706 | 6.1%
8 | 26399 | 4.5%
9 | 22340 | 3.8%

Maximum 10 values
Value | Count | Frequency (%)
53 | 1593 | 0.3%
52 | 1977 | 0.3%
51 | 974 | 0.2%
50 | 1051 | 0.2%
49 | 1977 | 0.3%
48 | 1703 | 0.3%
47 | 1539 | 0.3%
46 | 1506 | 0.3%
45 | 2374 | 0.4%
44 | 1241 | 0.2%

BusinessType
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 5
Missing (%): < 0.1%
Memory size: 4.5 MiB
Most frequent values (count): CORPORATION (516179); INDIVIDUAL (61455); PARTNERSHIP (10414)

Length

Max length: 11
Median length: 11
Mean length: 10.895493
Min length: 10

Characters and Unicode

Total characters: 6407073
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CORPORATION
2nd row: CORPORATION
3rd row: CORPORATION
4th row: CORPORATION
5th row: INDIVIDUAL

Common Values

Value | Count | Frequency (%)
CORPORATION | 516179 | 87.8%
INDIVIDUAL | 61455 | 10.5%
PARTNERSHIP | 10414 | 1.8%
(Missing) | 5 | < 0.1%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of common values
ValueCountFrequency (%)
corporation 516179
87.8%
individual 61455
 
10.5%
partnership 10414
 
1.8%

Most occurring characters

ValueCountFrequency (%)
O 1548537
24.2%
R 1053186
16.4%
I 710958
11.1%
A 588048
 
9.2%
N 588048
 
9.2%
P 537007
 
8.4%
T 526593
 
8.2%
C 516179
 
8.1%
D 122910
 
1.9%
V 61455
 
1.0%
Other values (5) 154152
 
2.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 6407073
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O 1548537
24.2%
R 1053186
16.4%
I 710958
11.1%
A 588048
 
9.2%
N 588048
 
9.2%
P 537007
 
8.4%
T 526593
 
8.2%
C 516179
 
8.1%
D 122910
 
1.9%
V 61455
 
1.0%
Other values (5) 154152
 
2.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 6407073
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
O 1548537
24.2%
R 1053186
16.4%
I 710958
11.1%
A 588048
 
9.2%
N 588048
 
9.2%
P 537007
 
8.4%
T 526593
 
8.2%
C 516179
 
8.1%
D 122910
 
1.9%
V 61455
 
1.0%
Other values (5) 154152
 
2.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6407073
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O 1548537
24.2%
R 1053186
16.4%
I 710958
11.1%
A 588048
 
9.2%
N 588048
 
9.2%
P 537007
 
8.4%
T 526593
 
8.2%
C 516179
 
8.1%
D 122910
 
1.9%
V 61455
 
1.0%
Other values (5) 154152
 
2.4%

LoanStatus
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB
Most frequent values (count): EXEMPT (262008); PIF (222841); CANCLD (66739); CHGOFF (20670); COMMIT (15795)

Length

Max length: 6
Median length: 6
Mean length: 4.8631586
Min length: 3

Characters and Unicode

Total characters: 2859795
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PIF
2nd row: PIF
3rd row: CHGOFF
4th row: EXEMPT
5th row: PIF

Common Values

Value | Count | Frequency (%)
EXEMPT | 262008 | 44.6%
PIF | 222841 | 37.9%
CANCLD | 66739 | 11.3%
CHGOFF | 20670 | 3.5%
COMMIT | 15795 | 2.7%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of common values
ValueCountFrequency (%)
exempt 262008
44.6%
pif 222841
37.9%
cancld 66739
 
11.3%
chgoff 20670
 
3.5%
commit 15795
 
2.7%

Most occurring characters

ValueCountFrequency (%)
E 524016
18.3%
P 484849
17.0%
M 293598
10.3%
T 277803
9.7%
F 264181
9.2%
X 262008
9.2%
I 238636
8.3%
C 169943
 
5.9%
A 66739
 
2.3%
N 66739
 
2.3%
Other values (5) 211283
7.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2859795
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 524016
18.3%
P 484849
17.0%
M 293598
10.3%
T 277803
9.7%
F 264181
9.2%
X 262008
9.2%
I 238636
8.3%
C 169943
 
5.9%
A 66739
 
2.3%
N 66739
 
2.3%
Other values (5) 211283
7.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 2859795
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 524016
18.3%
P 484849
17.0%
M 293598
10.3%
T 277803
9.7%
F 264181
9.2%
X 262008
9.2%
I 238636
8.3%
C 169943
 
5.9%
A 66739
 
2.3%
N 66739
 
2.3%
Other values (5) 211283
7.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2859795
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 524016
18.3%
P 484849
17.0%
M 293598
10.3%
T 277803
9.7%
F 264181
9.2%
X 262008
9.2%
I 238636
8.3%
C 169943
 
5.9%
A 66739
 
2.3%
N 66739
 
2.3%
Other values (5) 211283
7.4%

PaidInFullDate
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 130
Distinct (%): 0.1%
Missing: 365212
Missing (%): 62.1%
Memory size: 4.5 MiB
Most frequent values (count): 31-10-2019 (3819); 31-12-2019 (3779); 30-04-2019 (3604); 31-12-2017 (3516); 31-08-2019 (3484); other values (125): 204639

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 2228410
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 31-10-2016
2nd row: 31-03-2019
3rd row: 28-02-2013
4th row: 31-10-2019
5th row: 31-03-2017

Common Values

Value | Count | Frequency (%)
31-10-2019 | 3819 | 0.6%
31-12-2019 | 3779 | 0.6%
30-04-2019 | 3604 | 0.6%
31-12-2017 | 3516 | 0.6%
31-08-2019 | 3484 | 0.6%
30-09-2019 | 3444 | 0.6%
31-08-2018 | 3440 | 0.6%
31-05-2019 | 3430 | 0.6%
31-03-2019 | 3413 | 0.6%
31-10-2018 | 3357 | 0.6%
Other values (120) | 187555 | 31.9%
(Missing) | 365212 | 62.1%

Length

Figure: Histogram of lengths of the category
ValueCountFrequency (%)
31-10-2019 3819
 
1.7%
31-12-2019 3779
 
1.7%
30-04-2019 3604
 
1.6%
31-12-2017 3516
 
1.6%
31-08-2019 3484
 
1.6%
30-09-2019 3444
 
1.5%
31-08-2018 3440
 
1.5%
31-05-2019 3430
 
1.5%
31-03-2019 3413
 
1.5%
31-10-2018 3357
 
1.5%
Other values (120) 187555
84.2%

Most occurring characters

ValueCountFrequency (%)
0 501929
22.5%
- 445682
20.0%
1 427536
19.2%
2 306218
13.7%
3 236980
10.6%
8 70198
 
3.2%
9 63765
 
2.9%
7 52773
 
2.4%
6 46109
 
2.1%
5 41707
 
1.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1782728
80.0%
Dash Punctuation 445682
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 501929
28.2%
1 427536
24.0%
2 306218
17.2%
3 236980
13.3%
8 70198
 
3.9%
9 63765
 
3.6%
7 52773
 
3.0%
6 46109
 
2.6%
5 41707
 
2.3%
4 35513
 
2.0%
Dash Punctuation
ValueCountFrequency (%)
- 445682
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2228410
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 501929
22.5%
- 445682
20.0%
1 427536
19.2%
2 306218
13.7%
3 236980
10.6%
8 70198
 
3.2%
9 63765
 
2.9%
7 52773
 
2.4%
6 46109
 
2.1%
5 41707
 
1.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2228410
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 501929
22.5%
- 445682
20.0%
1 427536
19.2%
2 306218
13.7%
3 236980
10.6%
8 70198
 
3.2%
9 63765
 
2.9%
7 52773
 
2.4%
6 46109
 
2.1%
5 41707
 
1.9%
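PaidInFullDate is profiled as a categorical variable because its values are DD-MM-YYYY strings rather than dates. A minimal sketch of converting them with the standard library, using the five sample rows from this section; `parse_report_date` is a hypothetical helper name:

```python
from datetime import date, datetime

def parse_report_date(text: str) -> date:
    """Parse a DD-MM-YYYY string such as '31-10-2016' into a date."""
    return datetime.strptime(text, "%d-%m-%Y").date()

# The five PaidInFullDate sample rows from the report.
samples = ["31-10-2016", "31-03-2019", "28-02-2013", "31-10-2019", "31-03-2017"]
parsed = [parse_report_date(s) for s in samples]

print("earliest:", min(parsed))
print("latest:", max(parsed))
```

Parsing matters for ordering: sorted as strings, DD-MM-YYYY values are ordered by day of month first, so they do not sort chronologically.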

ChargeOffDate
Categorical

HIGH CARDINALITY  MISSING 

Distinct: 2184
Distinct (%): 10.6%
Missing: 567384
Missing (%): 96.5%
Memory size: 4.5 MiB
Most frequent values (count): 22-02-2019 (70); 24-03-2020 (69); 07-02-2019 (59); 03-03-2020 (56); 12-02-2019 (56); other values (2179): 20359

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 206690
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 243
Unique (%): 1.2%

Sample

1st row: 24-08-2015
2nd row: 16-05-2016
3rd row: 12-02-2018
4th row: 01-03-2011
5th row: 03-12-2012

Common Values

Value | Count | Frequency (%)
22-02-2019 | 70 | < 0.1%
24-03-2020 | 69 | < 0.1%
07-02-2019 | 59 | < 0.1%
03-03-2020 | 56 | < 0.1%
12-02-2019 | 56 | < 0.1%
15-03-2018 | 51 | < 0.1%
16-03-2017 | 49 | < 0.1%
02-01-2018 | 48 | < 0.1%
14-02-2019 | 47 | < 0.1%
12-03-2018 | 47 | < 0.1%
Other values (2174) | 20117 | 3.4%
(Missing) | 567384 | 96.5%

Length

Figure: Histogram of lengths of the category
ValueCountFrequency (%)
22-02-2019 70
 
0.3%
24-03-2020 69
 
0.3%
07-02-2019 59
 
0.3%
03-03-2020 56
 
0.3%
12-02-2019 56
 
0.3%
15-03-2018 51
 
0.2%
16-03-2017 49
 
0.2%
02-01-2018 48
 
0.2%
14-02-2019 47
 
0.2%
12-03-2018 47
 
0.2%
Other values (2174) 20117
97.3%

Most occurring characters

ValueCountFrequency (%)
0 48909
23.7%
- 41338
20.0%
2 35839
17.3%
1 34700
16.8%
8 7727
 
3.7%
9 7709
 
3.7%
7 6581
 
3.2%
6 6516
 
3.2%
3 6250
 
3.0%
5 5784
 
2.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 165352
80.0%
Dash Punctuation 41338
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 48909
29.6%
2 35839
21.7%
1 34700
21.0%
8 7727
 
4.7%
9 7709
 
4.7%
7 6581
 
4.0%
6 6516
 
3.9%
3 6250
 
3.8%
5 5784
 
3.5%
4 5337
 
3.2%
Dash Punctuation
ValueCountFrequency (%)
- 41338
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 206690
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 48909
23.7%
- 41338
20.0%
2 35839
17.3%
1 34700
16.8%
8 7727
 
3.7%
9 7709
 
3.7%
7 6581
 
3.2%
6 6516
 
3.2%
3 6250
 
3.0%
5 5784
 
2.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 206690
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 48909
23.7%
- 41338
20.0%
2 35839
17.3%
1 34700
16.8%
8 7727
 
3.7%
9 7709
 
3.7%
7 6581
 
3.2%
6 6516
 
3.2%
3 6250
 
3.0%
5 5784
 
2.8%

GrossChargeOffAmount
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 17960
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4305.2546
Minimum: 0
Maximum: 4706180
Zeros: 567384
Zeros (%): 96.5%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 4706180
Range: 4706180
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 50350.344
Coefficient of variation (CV): 11.695091
Kurtosis: 1388.0567
Mean: 4305.2546
Median Absolute Deviation (MAD): 0
Skewness: 29.433801
Sum: 2.5317179 × 10^9
Variance: 2.5351571 × 10^9
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0 | 567384 | 96.5%
25000 | 181 | < 0.1%
50000 | 99 | < 0.1%
10000 | 92 | < 0.1%
100000 | 59 | < 0.1%
20000 | 53 | < 0.1%
15000 | 42 | < 0.1%
24999 | 31 | < 0.1%
5000 | 29 | < 0.1%
350000 | 27 | < 0.1%
Other values (17950) | 20056 | 3.4%

Minimum 10 values
Value | Count | Frequency (%)
0 | 567384 | 96.5%
11 | 1 | < 0.1%
60 | 1 | < 0.1%
98 | 1 | < 0.1%
109 | 1 | < 0.1%
179 | 1 | < 0.1%
184 | 1 | < 0.1%
223 | 1 | < 0.1%
228 | 1 | < 0.1%
250 | 1 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
4706180 | 1 | < 0.1%
4334490 | 1 | < 0.1%
4254436 | 1 | < 0.1%
4084959 | 1 | < 0.1%
4002876 | 1 | < 0.1%
3867035 | 1 | < 0.1%
3730328 | 1 | < 0.1%
3498136 | 1 | < 0.1%
3418147 | 1 | < 0.1%
3403551 | 1 | < 0.1%

RevolverStatus
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.5 MiB
Most frequent values (count): 0 (402998); 1 (185055)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 588053
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 402998 | 68.5%
1 | 185055 | 31.5%

Length

Figure: Histogram of lengths of the category

Common Values (Plot)

Figure: plot of common values
ValueCountFrequency (%)
0 402998
68.5%
1 185055
31.5%

Most occurring characters

ValueCountFrequency (%)
0 402998
68.5%
1 185055
31.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 588053
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 402998
68.5%
1 185055
31.5%

Most occurring scripts

ValueCountFrequency (%)
Common 588053
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 402998
68.5%
1 185055
31.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 588053
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 402998
68.5%
1 185055
31.5%

JobsSupported
Real number (ℝ)

Distinct: 312
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.720804
Minimum: 0
Maximum: 2150
Zeros: 93989
Zeros (%): 16.0%
Negative: 0
Negative (%): 0.0%
Memory size: 4.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 2
median: 5
Q3: 11
95-th percentile: 41
Maximum: 2150
Range: 2150
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 20.425223
Coefficient of variation (CV): 1.9051951
Kurtosis: 290.78397
Mean: 10.720804
Median Absolute Deviation (MAD): 4
Skewness: 8.1718161
Sum: 6304401
Variance: 417.18973
Monotonicity: Not monotonic
Figure: Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
0 | 93989 | 16.0%
2 | 65142 | 11.1%
1 | 48415 | 8.2%
4 | 43341 | 7.4%
3 | 42502 | 7.2%
5 | 33952 | 5.8%
6 | 29927 | 5.1%
10 | 23149 | 3.9%
8 | 20835 | 3.5%
7 | 19374 | 3.3%
Other values (302) | 167427 | 28.5%

Minimum 10 values
Value | Count | Frequency (%)
0 | 93989 | 16.0%
1 | 48415 | 8.2%
2 | 65142 | 11.1%
3 | 42502 | 7.2%
4 | 43341 | 7.4%
5 | 33952 | 5.8%
6 | 29927 | 5.1%
7 | 19374 | 3.3%
8 | 20835 | 3.5%
9 | 11759 | 2.0%

Maximum 10 values
Value | Count | Frequency (%)
2150 | 1 | < 0.1%
1155 | 1 | < 0.1%
1011 | 1 | < 0.1%
850 | 1 | < 0.1%
750 | 1 | < 0.1%
725 | 1 | < 0.1%
600 | 1 | < 0.1%
570 | 1 | < 0.1%
480 | 1 | < 0.1%
460 | 2 | < 0.1%

Interactions

2023-03-20T18:46:54.902838image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:57.791121image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:47:00.764357image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:47:03.535913image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:47:06.430484image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:41.498341image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:44.300126image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:46.903157image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:49.516866image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:52.347347image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:55.184061image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:58.075361image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:47:01.030648image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:47:03.802200image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:47:06.767585image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:41.967088image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:44.573395image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:47.217317image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:49.797228image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:52.668488image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:55.488281image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:58.588988image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:47:01.305908image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:47:04.073487image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:47:07.166519image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:42.265316image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:44.821732image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:47.491598image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:50.033562image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:52.943753image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:55.790437image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:58.888254image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:47:01.574190image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:47:04.335784image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:47:07.458019image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:42.520643image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:45.060094image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:47.755908image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:50.278940image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:53.230984image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:56.033821image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:46:59.160500image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:47:01.854440image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/
2023-03-20T18:47:04.600178image/svg+xmlMatplotlib v3.5.0, https://matplotlib.org/

Correlations

Variable | BorrZip | GrossApproval | SBAGuaranteedApproval | ApprovalFiscalYear | InitialInterestRate | TermInMonths | NaicsCode | CongressionalDistrict | GrossChargeOffAmount | JobsSupported | BorrState | BankState | DeliveryMethod | subpgmdesc | ProjectState | SBADistrictOffice | BusinessType | LoanStatus | RevolverStatus
BorrZip | 1.000 | 0.102 | 0.111 | -0.011 | 0.010 | 0.115 | -0.002 | 0.135 | 0.002 | 0.057 | 0.956 | 0.566 | 0.104 | 0.095 | 0.955 | 0.951 | 0.089 | 0.034 | 0.115
GrossApproval | 0.102 | 1.000 | 0.993 | 0.071 | -0.413 | 0.597 | 0.057 | 0.054 | -0.078 | 0.437 | 0.064 | 0.103 | 0.196 | 0.195 | 0.064 | 0.066 | 0.059 | 0.053 | 0.306
SBAGuaranteedApproval | 0.111 | 0.993 | 1.000 | 0.064 | -0.408 | 0.618 | 0.068 | 0.049 | -0.071 | 0.436 | 0.054 | 0.089 | 0.174 | 0.174 | 0.054 | 0.056 | 0.050 | 0.041 | 0.239
ApprovalFiscalYear | -0.011 | 0.071 | 0.064 | 1.000 | 0.220 | 0.171 | 0.028 | 0.032 | -0.117 | -0.051 | 0.042 | 0.084 | 0.144 | 0.140 | 0.042 | 0.046 | 0.054 | 0.352 | 0.059
InitialInterestRate | 0.010 | -0.413 | -0.408 | 0.220 | 1.000 | -0.112 | -0.003 | 0.017 | 0.056 | -0.191 | 0.087 | 0.162 | 0.154 | 0.153 | 0.087 | 0.095 | 0.064 | 0.084 | 0.377
TermInMonths | 0.115 | 0.597 | 0.618 | 0.171 | -0.112 | 1.000 | 0.132 | 0.065 | -0.193 | 0.179 | 0.126 | 0.192 | 0.286 | 0.284 | 0.127 | 0.130 | 0.075 | 0.154 | 0.478
NaicsCode | -0.002 | 0.057 | 0.068 | 0.028 | -0.003 | 0.132 | 1.000 | 0.019 | -0.008 | 0.125 | 0.112 | 0.136 | 0.102 | 0.102 | 0.112 | 0.117 | 0.139 | 0.034 | 0.224
CongressionalDistrict | 0.135 | 0.054 | 0.049 | 0.032 | 0.017 | 0.065 | 0.019 | 1.000 | 0.009 | -0.002 | 0.367 | 0.206 | 0.045 | 0.039 | 0.367 | 0.568 | 0.054 | 0.016 | 0.043
GrossChargeOffAmount | 0.002 | -0.078 | -0.071 | -0.117 | 0.056 | -0.193 | -0.008 | 0.009 | 1.000 | -0.010 | 0.007 | 0.012 | 0.020 | 0.018 | 0.007 | 0.007 | 0.008 | 0.111 | 0.024
JobsSupported | 0.057 | 0.437 | 0.436 | -0.051 | -0.191 | 0.179 | 0.125 | -0.002 | -0.010 | 1.000 | 0.005 | 0.010 | 0.014 | 0.013 | 0.005 | 0.007 | 0.007 | 0.004 | 0.011
BorrState | 0.956 | 0.064 | 0.054 | 0.042 | 0.087 | 0.126 | 0.112 | 0.367 | 0.007 | 0.005 | 1.000 | 0.525 | 0.153 | 0.138 | 0.998 | 0.923 | 0.144 | 0.052 | 0.160
BankState | 0.566 | 0.103 | 0.089 | 0.084 | 0.162 | 0.192 | 0.136 | 0.206 | 0.012 | 0.010 | 0.525 | 1.000 | 0.210 | 0.183 | 0.525 | 0.515 | 0.120 | 0.088 | 0.388
DeliveryMethod | 0.104 | 0.196 | 0.174 | 0.144 | 0.154 | 0.286 | 0.102 | 0.045 | 0.020 | 0.014 | 0.153 | 0.210 | 1.000 | 0.889 | 0.153 | 0.155 | 0.088 | 0.121 | 0.658
subpgmdesc | 0.095 | 0.195 | 0.174 | 0.140 | 0.153 | 0.284 | 0.102 | 0.039 | 0.018 | 0.013 | 0.138 | 0.183 | 0.889 | 1.000 | 0.138 | 0.141 | 0.088 | 0.118 | 0.678
ProjectState | 0.955 | 0.064 | 0.054 | 0.042 | 0.087 | 0.127 | 0.112 | 0.367 | 0.007 | 0.005 | 0.998 | 0.525 | 0.153 | 0.138 | 1.000 | 0.925 | 0.144 | 0.052 | 0.160
SBADistrictOffice | 0.951 | 0.066 | 0.056 | 0.046 | 0.095 | 0.130 | 0.117 | 0.568 | 0.007 | 0.007 | 0.923 | 0.515 | 0.155 | 0.141 | 0.925 | 1.000 | 0.153 | 0.055 | 0.163
BusinessType | 0.089 | 0.059 | 0.050 | 0.054 | 0.064 | 0.075 | 0.139 | 0.054 | 0.008 | 0.007 | 0.144 | 0.120 | 0.088 | 0.088 | 0.144 | 0.153 | 1.000 | 0.035 | 0.032
LoanStatus | 0.034 | 0.053 | 0.041 | 0.352 | 0.084 | 0.154 | 0.034 | 0.016 | 0.111 | 0.004 | 0.052 | 0.088 | 0.121 | 0.118 | 0.052 | 0.055 | 0.035 | 1.000 | 0.087
RevolverStatus | 0.115 | 0.306 | 0.239 | 0.059 | 0.377 | 0.478 | 0.224 | 0.043 | 0.024 | 0.011 | 0.160 | 0.388 | 0.658 | 0.678 | 0.160 | 0.163 | 0.032 | 0.087 | 1.000
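
One way to read the matrix is to list, for a given variable, the other variables whose coefficient is large in absolute value. The sketch below does this for two rows copied from the table above. The 0.5 cutoff and the helper name `strong_partners` are assumptions made for illustration, not settings stated in this report.

```python
# Two rows copied from the correlation table (self-correlations omitted).
CORR = {
    "BorrZip": {
        "GrossApproval": 0.102, "SBAGuaranteedApproval": 0.111,
        "ApprovalFiscalYear": -0.011, "InitialInterestRate": 0.010,
        "TermInMonths": 0.115, "NaicsCode": -0.002,
        "CongressionalDistrict": 0.135, "GrossChargeOffAmount": 0.002,
        "JobsSupported": 0.057, "BorrState": 0.956, "BankState": 0.566,
        "DeliveryMethod": 0.104, "subpgmdesc": 0.095, "ProjectState": 0.955,
        "SBADistrictOffice": 0.951, "BusinessType": 0.089,
        "LoanStatus": 0.034, "RevolverStatus": 0.115,
    },
    "GrossApproval": {
        "BorrZip": 0.102, "SBAGuaranteedApproval": 0.993,
        "ApprovalFiscalYear": 0.071, "InitialInterestRate": -0.413,
        "TermInMonths": 0.597, "NaicsCode": 0.057,
        "CongressionalDistrict": 0.054, "GrossChargeOffAmount": -0.078,
        "JobsSupported": 0.437, "BorrState": 0.064, "BankState": 0.103,
        "DeliveryMethod": 0.196, "subpgmdesc": 0.195, "ProjectState": 0.064,
        "SBADistrictOffice": 0.066, "BusinessType": 0.059,
        "LoanStatus": 0.053, "RevolverStatus": 0.306,
    },
}


def strong_partners(row, threshold=0.5):
    """Return (name, value) pairs with |value| >= threshold, strongest first.

    `threshold` is a hypothetical cutoff chosen for this example.
    """
    hits = [(name, value) for name, value in row.items() if abs(value) >= threshold]
    return sorted(hits, key=lambda item: abs(item[1]), reverse=True)


for variable, row in CORR.items():
    print(variable, strong_partners(row))
```

With this cutoff, BorrZip pairs with four variables (BorrState, ProjectState, SBADistrictOffice, BankState) and GrossApproval with two (SBAGuaranteedApproval, TermInMonths).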

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
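
To make the nullity correlation concrete, here is a minimal sketch: each column is turned into a 0/1 "is missing" indicator, and the Pearson correlation of two indicators is computed. The toy records and the helper names are hypothetical; only the column names come from this dataset. With pandas, `df.isna().corr()` should give comparable numbers.

```python
from math import sqrt


def missing_flags(records, column):
    """1 where the value is missing (None), else 0."""
    return [1 if rec.get(column) is None else 0 for rec in records]


def pearson(xs, ys):
    """Pearson correlation; None when either input has no variation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # column is always present or always missing
    return cov / sqrt(var_x * var_y)


def nullity_correlation(records, col_a, col_b):
    """Correlation between the missingness of two columns."""
    return pearson(missing_flags(records, col_a), missing_flags(records, col_b))


# Hypothetical toy records, not rows from the report.
toy = [
    {"LoanStatus": "PIF", "PaidInFullDate": "31-10-2016", "ChargeOffDate": None},
    {"LoanStatus": "CHGOFF", "PaidInFullDate": None, "ChargeOffDate": "24-08-2015"},
    {"LoanStatus": "EXEMPT", "PaidInFullDate": None, "ChargeOffDate": None},
    {"LoanStatus": "PIF", "PaidInFullDate": "28-02-2013", "ChargeOffDate": None},
]

print(nullity_correlation(toy, "PaidInFullDate", "ChargeOffDate"))
```

A negative value means one column tends to be filled in where the other is missing; a column with no missing values has no defined nullity correlation, which the sketch reports as `None`.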

Sample

Index | AsOfDate | Program | BorrName | BorrStreet | BorrCity | BorrState | BorrZip | BankName | BankStreet | BankCity | BankState | BankZip | GrossApproval | SBAGuaranteedApproval | ApprovalDate | ApprovalFiscalYear | FirstDisbursementDate | DeliveryMethod | subpgmdesc | InitialInterestRate | TermInMonths | NaicsCode | NaicsDescription | FranchiseCode | FranchiseName | ProjectCounty | ProjectState | SBADistrictOffice | CongressionalDistrict | BusinessType | LoanStatus | PaidInFullDate | ChargeOffDate | GrossChargeOffAmount | RevolverStatus | JobsSupported
0 | 20200930 | 7A | CRESA PARTNERS - DENVER, INC. | 7979 E TUFTS AVE PKWY STE 810 | DENVER | CO | 80237 | Zions Bank, A Division of | 1 S Main St | SALT LAKE CITY | UT | 84133 | 250000 | 125000 | 01-10-2009 | 2010 | 01-10-2009 | SBA EXPRES | FA$TRK (Small Loan Express) | 5.26 | 84 | 531210.0 | Offices of Real Estate Agents and Brokers | NaN | NaN | DENVER | CO | COLORADO DISTRICT OFFICE | 1.0 | CORPORATION | PIF | 31-10-2016 | NaN | 0 | 1 | 35
1 | 20200930 | 7A | The Hilltop Tavern | 4757 Folsom Blvd | Sacramento | CA | 95819 | Plumas Bank | 336 W Main St | QUINCY | CA | 95971 | 233500 | 210150 | 01-10-2009 | 2010 | 01-10-2009 | PLP | Guaranty | 6.00 | 120 | 722410.0 | Drinking Places (Alcoholic Beverages) | NaN | NaN | SACRAMENTO | CA | SACRAMENTO DISTRICT OFFICE | 6.0 | CORPORATION | PIF | 31-03-2019 | NaN | 0 | 0 | 6
2 | 20200930 | 7A | River City Car Wash LLC | 649 Harbor Blvd | West Sacramento | CA | 95691 | Wells Fargo Bank, National Association | 101 N Philips Ave | SIOUX FALLS | SD | 57104 | 683900 | 615510 | 01-10-2009 | 2010 | 01-11-2009 | PLP | Guaranty | 5.25 | 210 | 811192.0 | Car Washes | NaN | NaN | YOLO | CA | SACRAMENTO DISTRICT OFFICE | 3.0 | CORPORATION | CHGOFF | NaN | 24-08-2015 | 320098 | 0 | 27
3 | 20200930 | 7A | Alphagraphics | 71 Newtown Road. | Danbury | CT | 6810 | Union Savings Bank | 226 Main St | DANBURY | CT | 6810 | 100000 | 50000 | 01-10-2009 | 2010 | 01-10-2009 | SBA EXPRES | FA$TRK (Small Loan Express) | 5.25 | 84 | 323110.0 | Commercial Lithographic Printing | 03512 | ALPHAGRAPHICS, PRINTSHOPS OF T | FAIRFIELD | CT | CONNECTICUT DISTRICT OFFICE | 5.0 | CORPORATION | EXEMPT | NaN | NaN | 0 | 1 | 5
4 | 20200930 | 7A | ON SITE AUTOMOTIVE APPEARANCE | 603 WOODBRIDGE COURT | MIDDLEBURY | IN | 46540 | VelocitySBA, LLC | 9385 Haven Avenue | Rancho Cucamonga | CA | 91730 | 12500 | 11250 | 01-10-2009 | 2010 | 01-10-2009 | COMM EXPRS | Community Express | 7.75 | 120 | 811121.0 | Automotive Body, Paint, and Interior Repair and Maintenance | NaN | NaN | ELKHART | IN | INDIANA DISTRICT OFFICE | 3.0 | INDIVIDUAL | PIF | 28-02-2013 | NaN | 0 | 0 | 2
5 | 20200930 | 7A | AW ENTERPRISES | 106 WEST LAKE STREET | CAMDEN | TN | 38320 | VelocitySBA, LLC | 9385 Haven Avenue | Rancho Cucamonga | CA | 91730 | 10000 | 9000 | 01-10-2009 | 2010 | 01-10-2009 | COMM EXPRS | Community Express | 7.24 | 120 | 453220.0 | Gift, Novelty, and Souvenir Stores | NaN | NaN | BENTON | TN | TENNESSEE DISTRICT OFFICE | 7.0 | INDIVIDUAL | PIF | 31-10-2019 | NaN | 0 | 0 | 3
6 | 20200930 | 7A | TA SPECIALTY | 331 HEATHERSTONE ROAD | COLUMBIA | SC | 29212 | VelocitySBA, LLC | 9385 Haven Avenue | Rancho Cucamonga | CA | 91730 | 20000 | 18000 | 01-10-2009 | 2010 | 01-10-2009 | PATRIOT EX | Patriot Express | 7.75 | 120 | 541890.0 | Other Services Related to Advertising | NaN | NaN | LEXINGTON | SC | SOUTH CAROLINA DISTRICT OFFICE | 2.0 | INDIVIDUAL | PIF | 31-03-2017 | NaN | 0 | 0 | 4
7 | 20200930 | 7A | Hansen Concrete, Inc. | 8656 71st Street NE | SPICER | MN | 56288 | Lake Region Bank | 51 Main St | NEW LONDON | MN | 56273 | 200000 | 100000 | 01-10-2009 | 2010 | 01-12-2009 | SBA EXPRES | FA$TRK (Small Loan Express) | 7.50 | 60 | 236220.0 | Commercial and Institutional Building Construction | NaN | NaN | KANDIYOHI | MN | MINNESOTA DISTRICT OFFICE | 7.0 | CORPORATION | PIF | 29-02-2012 | NaN | 0 | 1 | 12
8 | 20200930 | 7A | SHADIA'S SECRETS INC | 950 GLADES RD 2ND FLOOR | BOCA RATON | FL | 33431 | VelocitySBA, LLC | 9385 Haven Avenue | Rancho Cucamonga | CA | 91730 | 15000 | 13500 | 01-10-2009 | 2010 | 01-10-2009 | COMM EXPRS | Community Express | 7.75 | 120 | 446199.0 | All Other Health and Personal Care Stores | NaN | NaN | PALM BEACH | FL | SOUTH FLORIDA DISTRICT OFFICE | 22.0 | CORPORATION | PIF | 31-10-2019 | NaN | 0 | 0 | 1
9 | 20200930 | 7A | JT'S COFFEE AND SANDWICH SHOP | 1167 HARRISON STREET | SAN FRANCISCO | CA | 94103 | VelocitySBA, LLC | 9385 Haven Avenue | Rancho Cucamonga | CA | 91730 | 20000 | 18000 | 01-10-2009 | 2010 | 01-10-2009 | COMM EXPRS | Community Express | 7.75 | 120 | 722213.0 | Snack and Nonalcoholic Beverage Bars | NaN | NaN | SAN FRANCISCO | CA | SAN FRANCISCO DISTRICT OFFICE | 12.0 | CORPORATION | PIF | 31-10-2017 | NaN | 0 | 0 | 4
Index | AsOfDate | Program | BorrName | BorrStreet | BorrCity | BorrState | BorrZip | BankName | BankStreet | BankCity | BankState | BankZip | GrossApproval | SBAGuaranteedApproval | ApprovalDate | ApprovalFiscalYear | FirstDisbursementDate | DeliveryMethod | subpgmdesc | InitialInterestRate | TermInMonths | NaicsCode | NaicsDescription | FranchiseCode | FranchiseName | ProjectCounty | ProjectState | SBADistrictOffice | CongressionalDistrict | BusinessType | LoanStatus | PaidInFullDate | ChargeOffDate | GrossChargeOffAmount | RevolverStatus | JobsSupported
588043 | 20200930 | 7A | MSA Building LLC | 16535 Grand River Ave | Detroit | MI | 48227 | The Huntington National Bank | 17 S High St | COLUMBUS | OH | 43215.0 | 479400 | 359550 | 30-09-2020 | 2020 | NaN | PLP | Guaranty | 6.00 | 180 | 445310.0 | Beer, Wine, and Liquor Stores | NaN | NaN | WAYNE | MI | MICHIGAN DISTRICT OFFICE | 14.0 | CORPORATION | COMMIT | NaN | NaN | 0 | 0 | 4
588044 | 20200930 | 7A | Apponaug Animal Hospital, Inc. | 61 Gilbane Street a/k/a 42 B | Warwick | RI | 2886 | HarborOne Bank | 770 Oak St | BROCKTON | MA | 2301.0 | 180000 | 135000 | 30-09-2020 | 2020 | NaN | PLP | Guaranty | 4.25 | 120 | 541940.0 | Veterinary Services | NaN | NaN | KENT | RI | RHODE ISLAND DISTRICT OFFICE | 2.0 | CORPORATION | COMMIT | NaN | NaN | 0 | 0 | 6
588045 | 20200930 | 7A | Brass and rose-Kyle LLC | 5401 FM 1626 #200 | Kyle | TX | 78640 | Wallis Bank | 6510 Railroad St | WALLIS | TX | 77485.0 | 261000 | 195750 | 30-09-2020 | 2020 | NaN | PLP | Guaranty | 5.00 | 126 | 812990.0 | All Other Personal Services | S0101 | Amazing Lash Studio | HAYS | TX | SAN ANTONIO DISTRICT OFFICE | 35.0 | PARTNERSHIP | COMMIT | NaN | NaN | 0 | 0 | 52
588046 | 20200930 | 7A | GODINEZ LOGISTICS LLC | 7704 CLEARWATER AVE W | KENNEWICK | WA | 99336 | U.S. Bank, National Association | 425 Walnut St | CINCINNATI | OH | 45202.0 | 19000 | 9500 | 30-09-2020 | 2020 | NaN | SBA EXPRES | FA$TRK (Small Loan Express) | 5.24 | 60 | 484121.0 | General Freight Trucking, Long Distance, Truckload | NaN | NaN | BENTON | WA | SPOKANE BRANCH OFFICE | 4.0 | CORPORATION | COMMIT | NaN | NaN | 0 | 0 | 1
588047 | 20200930 | 7A | DazDan Foods LLC | 177 Elton Adelphia Road | Freehold | NJ | 7728 | Trenton Business Assistance Corporation | 3111 Quakerbridge Road | Mercerville | NJ | 8619.0 | 200000 | 150000 | 30-09-2020 | 2020 | NaN | CA | Community Advantage Initiative | 5.25 | 120 | 445299.0 | All Other Specialty Food Stores | NaN | NaN | MONMOUTH | NJ | NEW JERSEY DISTRICT OFFICE | 4.0 | CORPORATION | COMMIT | NaN | NaN | 0 | 0 | 15
588048 | 20200930 | 7A | Lincoln Alan Construction Grou | 9120 Crackle Rd | Chagrin Falls | OH | 44023 | The Huntington National Bank | 17 S High St | COLUMBUS | OH | 43215.0 | 200000 | 100000 | 30-09-2020 | 2020 | NaN | SBA EXPRES | FA$TRK (Small Loan Express) | 5.50 | 120 | 236118.0 | Residential Remodelers | NaN | NaN | GEAUGA | OH | CLEVELAND DISTRICT OFFICE | 14.0 | CORPORATION | COMMIT | NaN | NaN | 0 | 1 | 0
588049 | 20200930 | 7A | PFFL Holdings, LLC | 601 CHURCH ST | NASHVILLE | TN | 37219 | CapStar Bank | 1201 Demonbreun St | NASHVILLE | TN | 37203.0 | 783500 | 587625 | 30-09-2020 | 2020 | NaN | PLP | Guaranty | 6.00 | 126 | 722511.0 | Full-Service Restaurants | S3558 | COPPER BRANCH | DAVIDSON | TN | TENNESSEE DISTRICT OFFICE | 5.0 | CORPORATION | COMMIT | NaN | NaN | 0 | 0 | 17
588050 | 20200930 | 7A | PFFL Holdings, LLC | 601 CHURCH ST | NASHVILLE | TN | 37219 | CapStar Bank | 1201 Demonbreun St | NASHVILLE | TN | 37203.0 | 150000 | 75000 | 30-09-2020 | 2020 | NaN | PLP | Guaranty | 6.00 | 120 | 722511.0 | Full-Service Restaurants | S3558 | COPPER BRANCH | DAVIDSON | TN | TENNESSEE DISTRICT OFFICE | 5.0 | CORPORATION | COMMIT | NaN | NaN | 0 | 0 | 17
588051 | 20200930 | 7A | Norkobe's Beauty | 701 N STATE ROAD 7 | MARGATE | FL | 33063 | Harvest Small Business Finance, LLC | 24422 Avenida de la Carlota | Laguna Hills | CA | 92653.0 | 620000 | 465000 | 30-09-2020 | 2020 | NaN | PLP | Guaranty | 6.00 | 300 | 812112.0 | Beauty Salons | NaN | NaN | BROWARD | FL | SOUTH FLORIDA DISTRICT OFFICE | 22.0 | CORPORATION | COMMIT | NaN | NaN | 0 | 0 | 16
588052 | 20200930 | 7A | Prestigious Appliance LLC | 1118 Gary Alan Trace | Moody | AL | 35004 | Progress Bank and Trust | 201 Williams Ave | HUNTSVILLE | AL | 35801.0 | 261000 | 195750 | 30-09-2020 | 2020 | NaN | OTH 7A | Guaranty | 6.00 | 120 | 443141.0 | Household Appliance Stores | NaN | NaN | SAINT CLAIR | AL | ALABAMA DISTRICT OFFICE | 3.0 | CORPORATION | COMMIT | NaN | NaN | 0 | 0 | 8

Duplicate rows

Most frequently occurring

Index | AsOfDate | Program | BorrName | BorrStreet | BorrCity | BorrState | BorrZip | BankName | BankStreet | BankCity | BankState | GrossApproval | SBAGuaranteedApproval | ApprovalDate | ApprovalFiscalYear | FirstDisbursementDate | DeliveryMethod | subpgmdesc | InitialInterestRate | TermInMonths | NaicsCode | NaicsDescription | FranchiseCode | FranchiseName | ProjectCounty | ProjectState | SBADistrictOffice | CongressionalDistrict | BusinessType | LoanStatus | PaidInFullDate | ChargeOffDate | GrossChargeOffAmount | RevolverStatus | JobsSupported | # duplicates
195 | 20200930 | 7A | Cedar Investments, LLC | 1210 N. Stone Ave. | Tucson | AZ | 85705 | Bank of the West | 180 Montgomery St | SAN FRANCISCO | CA | 277700 | 208275 | 21-10-2019 | 2020 | NaN | PLP | Guaranty | 4.78 | 300 | 561421.0 | Telephone Answering Services | NaN | NaN | PIMA | AZ | ARIZONA DISTRICT OFFICE | 3.0 | CORPORATION | CANCLD | NaN | NaN | 0 | 0 | 0 | 8
904 | 20200930 | 7A | Van Row Mechanical, Inc. | 1225 Avenue C | White City | OR | 97503 | People's Bank of Commerce | 1311 E Barnett Rd | MEDFORD | OR | 60000 | 51000 | 31-10-2011 | 2012 | NaN | PATRIOT EX | Patriot Express | 5.25 | 12 | 238220.0 | Plumbing, Heating, and Air-Conditioning Contractors | NaN | NaN | JACKSON | OR | PORTLAND DISTRICT OFFICE | 2.0 | CORPORATION | CANCLD | NaN | NaN | 0 | 1 | 4 | 8
339 | 20200930 | 7A | GONZO'S BASIC SOLUTIONS | 1921 EAST CARNEGIE AVE SUITE | SANTA ANA | CA | 92705 | American Plus Bank, National Association | 630 W Duarte Rd | ARCADIA | CA | 445000 | 333750 | 03-10-2012 | 2013 | NaN | PLP | Guaranty | 4.50 | 300 | 424210.0 | Drugs and Druggists' Sundries Merchant Wholesalers | NaN | NaN | ORANGE | CA | SANTA ANA DISTRICT OFFICE | 45.0 | CORPORATION | CANCLD | NaN | NaN | 0 | 0 | 2 | 7
581 | 20200930 | 7A | Martin Bionics Innovations LLC | 214 E. MAIN ST | OKLAHOMA CITY | OK | 73104 | BancFirst | 101 N Broadway, Ste 1050 | OKLAHOMA CITY | OK | 250000 | 187500 | 11-03-2020 | 2020 | NaN | PLP | Guaranty | 5.75 | 84 | 541990.0 | All Other Professional, Scientific, and Technical Services | NaN | NaN | OKLAHOMA | OK | OKLAHOMA DISTRICT OFFICE | 5.0 | CORPORATION | COMMIT | NaN | NaN | 0 | 0 | 50 | 7
907 | 20200930 | 7A | Vela Transport LLC | 5459 Crippen Ave SW | Wyoming | MI | 49548 | The Huntington National Bank | 17 S High St | COLUMBUS | OH | 21400 | 10700 | 14-03-2014 | 2014 | 01-04-2014 | SBA EXPRES | FA$TRK (Small Loan Express) | 4.90 | 60 | 484230.0 | Specialized Freight (except Used Goods) Trucking, Long-Distance | NaN | NaN | KENT | MI | MICHIGAN DISTRICT OFFICE | 3.0 | CORPORATION | PIF | 30-04-2019 | NaN | 0 | 0 | 4 | 7
391 | 20200930 | 7A | Happy Start, Inc. | 2280 Bypass 35 | Alvin | TX | 77511 | Texas Advantage Community Bank, National Association | 1701 Fairway Plaza, Ste 18 | ALVIN | TX | 1064000 | 798000 | 05-10-2015 | 2016 | NaN | OTH 7A | Guaranty | 5.50 | 240 | 624410.0 | Child Day Care Services | NaN | NaN | BRAZORIA | TX | HOUSTON DISTRICT OFFICE | 14.0 | CORPORATION | CANCLD | NaN | NaN | 0 | 0 | 0 | 6
499 | 20200930 | 7A | L&E Transport, LLC | 2250 N Whistlevale Dr SW | Bryon Center | MI | 49315 | The Huntington National Bank | 17 S High St | COLUMBUS | OH | 53100 | 26550 | 25-01-2017 | 2017 | 01-02-2017 | SBA EXPRES | FA$TRK (Small Loan Express) | 6.50 | 60 | 484121.0 | General Freight Trucking, Long Distance, Truckload | NaN | NaN | KENT | MI | MICHIGAN DISTRICT OFFICE | 3.0 | CORPORATION | EXEMPT | NaN | NaN | 0 | 0 | 6 | 6
502 | 20200930 | 7A | LCTC, INC. | 5084 NORTH FRUIT AVENUE SUITE | FRESNO | CA | 93711 | Premier Valley Bank | 255 E River Park Circle Dr, St | FRESNO | CA | 300000 | 225000 | 23-09-2011 | 2011 | NaN | PLP | Guaranty | 6.00 | 120 | 621610.0 | Home Health Care Services | NaN | NaN | FRESNO | CA | FRESNO DISTRICT OFFICE | 22.0 | CORPORATION | CANCLD | NaN | NaN | 0 | 0 | 3 | 6
604 | 20200930 | 7A | Mopho Group, LLC | 514 City Park Avenue, suite | New Orleans | LA | 70119 | Gulf Coast Bank and Trust Company | 200 St. Charles Ave | NEW ORLEANS | LA | 520000 | 390000 | 25-07-2013 | 2013 | NaN | PLP | Guaranty | 6.50 | 120 | 722511.0 | Full-Service Restaurants | NaN | NaN | ORLEANS | LA | LOUISIANA DISTRICT OFFICE | 2.0 | CORPORATION | CANCLD | NaN | NaN | 0 | 0 | 29 | 6
636 | 20200930 | 7A | North Terrace PM LLC | 4344 Belleview Ave | Kansas City | MO | 64111 | Bank of the West | 180 Montgomery St | SAN FRANCISCO | CA | 100000 | 85000 | 23-01-2018 | 2018 | NaN | PLP | Guaranty | 7.25 | 120 | 531110.0 | Lessors of Residential Buildingsv and Dwellings | NaN | NaN | JACKSON | MO | KANSAS CITY DISTRICT OFFICE | 5.0 | CORPORATION | CANCLD | NaN | NaN | 0 | 0 | 0 | 6
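
A listing like the one above can be produced by counting how often each full row occurs and keeping only the rows seen more than once. The following is a minimal sketch of that idea using the standard library; the helper name `duplicate_groups`, the three-field toy rows, and their multiplicities are hypothetical, and how the report itself counts duplicates is not stated here.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return (row, occurrences) for rows seen more than once, most frequent first."""
    counts = Counter(tuple(row) for row in rows)
    groups = [(row, n) for row, n in counts.items() if n > 1]
    return sorted(groups, key=lambda group: group[1], reverse=True)


# Hypothetical toy rows (BorrName, BorrState, GrossApproval); counts are invented.
toy_rows = [
    ("Cedar Investments, LLC", "AZ", 277700),
    ("Van Row Mechanical, Inc.", "OR", 60000),
    ("Cedar Investments, LLC", "AZ", 277700),
    ("MSA Building LLC", "MI", 479400),
    ("Cedar Investments, LLC", "AZ", 277700),
    ("Van Row Mechanical, Inc.", "OR", 60000),
]

for row, n in duplicate_groups(toy_rows):
    print(n, row)
```

Rows must match on every field to be grouped, so a record that differs in a single column (for example a different GrossApproval) is not treated as a duplicate.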